[RFC] Redesign of the data flow interface

Hello everyone ...

A while back I talked about redesigning the data flow part of the RTT... So
here is the starting point.

Current status
=========

The good
--------------
* one-writer, multiple-readers model. Multiple read ports can read from a
single write port, but there is no way for multiple write ports to write to
the same read port.
* no weird behaviour in case of multiple buffer connections that read the same
port.
* runtime connection policies. The parameters for each connection are set at
runtime. A default policy can be specified at the read port level in order
to simplify component usage (if you don't know what the component needs,
then just use its default policy). See the sketch after this list.
* whatever the policy is, one can know whether a sample has ever been written
on the port, i.e. no need to have a "default value" that means "no data"
anymore.
* proper connection management: if one side disconnects, then the other side
knows that nobody is connected to it anymore.
* a write port can optionally "remember" the last sample written on it. If
that is enabled, the policy allows new connections to be initialized with
the last value written on the port (if there is one).
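
To make the policy part concrete, here is a minimal sketch (plain C++, nothing
but the standard library) of what "runtime policy + per-port default" could
look like. All the names (ConnectionPolicy, ReadPort, WritePort, connect) are
made up for illustration, this is not the actual API in the branch:

  #include <iostream>
  #include <string>

  // Hypothetical policy object: chosen per connection, at runtime.
  struct ConnectionPolicy {
      enum Type { DATA, BUFFER };
      Type type = DATA;                   // keep only the last value, or FIFO buffer
      int  size = 1;                      // buffer depth, ignored for DATA
      bool init_with_last_sample = false; // seed the connection with the last write
  };

  // Hypothetical ports, only to show where the default policy lives.
  struct ReadPort {
      std::string name;
      ConnectionPolicy default_policy;    // used when the deployer has no opinion
  };
  struct WritePort {
      std::string name;
      bool keep_last_sample;              // the optional "remember the last sample"
  };

  // Deployment-time call: the component code itself never sees the policy.
  void connect(const WritePort& w, const ReadPort& r, const ConnectionPolicy& p) {
      std::cout << "connect " << w.name << " -> " << r.name
                << " type=" << (p.type == ConnectionPolicy::BUFFER ? "buffer" : "data")
                << " size=" << p.size << "\n";
  }

  int main() {
      WritePort scans = { "laser.scans", true };

      ConnectionPolicy slam_default;      // the read port's default policy
      slam_default.type = ConnectionPolicy::BUFFER;
      slam_default.size = 10;
      slam_default.init_with_last_sample = true;
      ReadPort slam = { "slam.scans", slam_default };

      ConnectionPolicy p;                 // explicit policy for this connection...
      p.type = ConnectionPolicy::BUFFER;
      p.size = 50;
      connect(scans, slam, p);

      // ...or just fall back to whatever the reading component suggests.
      connect(scans, slam, slam.default_policy);
  }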

The bad
--------------
* CORBA not implemented. What I will have in the end is a CORBA
implementation that is on par with the C++ one: both pull and push models
available, proper connection management, data signalling (the new "event
ports" will work over CORBA)
* thread-safety is not properly taken care of. I'd love to have a proper R/W
lock for connection/disconnection operations, but there is none in RTT
itself. Would there be a problem if I was using the boost implementation ?
* no backward compatibility. That could be implemented easily though.

The ugly
-------------
* compilation times seem to be on par with the old implementation (no
improvement here)

Checking it out
=========

The code is in the new_data_flow branch of orocos-rtt on my github account:
http://github.com/doudou/orocos-rtt/commits/new_data_flow

It is based on Peter's master of the day before yesterday.
See the testPort* methods in tests/generictask_test_3.cpp for examples.
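
If you just want the flavour of it without checking the branch out, here is a
rough sketch of the behaviour those tests exercise, namely that a reader can
always tell whether a sample was ever written. The DataChannel class below is
made up for illustration, it is not the code from the branch:

  #include <cassert>
  #include <optional>

  // Hypothetical in-process "data" connection: keeps only the last written
  // sample, plus the information of whether anything was ever written, so
  // no magic "no data" default value is needed on the reader side.
  template <typename T>
  class DataChannel {
  public:
      void write(const T& sample) { last_ = sample; }
      std::optional<T> read() const { return last_; }  // empty until the first write
  private:
      std::optional<T> last_;
  };

  int main() {
      DataChannel<double> position;

      // Before any write, the reader can tell that no sample ever arrived.
      assert(!position.read());

      position.write(0.42);
      assert(position.read() && *position.read() == 0.42);
      return 0;
  }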

To Peter: please, PLEASE, *PLEASE* don't push that branch to SVN. I'd like to
keep control of this code (meaning: not having to rebase every two days to
keep up with your updates).

Any constructive comment is welcome

Sylvain

[RFC] Redesign of the data flow interface

> Sounds good. I hope we can keep the policy code as much as possible out of
> the components, because much will depend on the environment in which the
> component will be deployed. But a default won't hurt in most cases.
The policy code *is* out of the components. There is only a default policy,
everything else is set up at runtime.

> > * CORBA not implemented. What I will have in the end is a CORBA
> > implementation that is on-par with the C++ one: both pull or push
> > models available, proper connection management, data signalling (the new
> > "event ports" will work over CORBA)
>
> Up to which point would that be compatible/comparable with the CORBA event
> service ?
I don't plan to use the event service, as you said yourself you had too many
problems setting it up. I plan to use plain method calls, but I was thinking
of decoupling the task contexts from the CORBA method calls by using a
separate TaskContext that does the message passing between processes.

> How do you feel about a UDP based 'streaming' data flow protocol
> without data arrival guarantees (send and forget) ?
I feel bad about it. For two reasons: first, one would have to implement it
(which is a lot of work). Second, that is IMO only valid if there is a way to
supervise the communication layer, which means that we would need to write a
full communication framework. Not practical if we don't have someone full time
doing that.

> > * thread-safety is not properly taken care of. I'd love to have a proper
> > R/W lock for connection/disconnection operations, but there is none in
> > RTT itself. Would there be a problem if I was using the boost
> > implementation ?
>
> Locks are fine, but we might need to map to the correct API on LXRT and
> Xenomai. Although they do allow posix locks in non-real-time threads, which
> would allow boost locks in these cases.
Mmmm... Would it be possible to get a Posix-Xenomai-LXRT R/W lock
implementation in Orocos then ? That would avoid a lot of contention problems.
Or would you have other means/ideas about how to do that ?

> > The ugly
> > -------------
> > * compilation times seem to be on par with the old implementation (no
> > improvement here)
>
> There is/may be a solution for that one. The type system is actually the
> key. it already allows to construct objects of type T. We could extend it
> and only 'generate' the template based code in the toolkits, instead of
> each possible component, which is now the case. We have a tremendous amount
> of binary code duplication because of the templates. A factory system could
> solve a lot, but then this would require everyone to define his own toolkit
> (which isn't that hard nowadays, but could use improvements).
I thought about it. What I don't like is that you would *need* to have a
toolkit definition for all types that pass through ports. It is actually
cool, right now, that you can pass whatever type you want internally. Toolkits
are only needed when you go through CORBA or property files.

Sylvain

[RFC] Redesign of the data flow interface

On Thursday 05 March 2009 11:39:18 Sylvain Joyeux wrote:
> > Sounds good. I hope we can keep the policy code as much as possible out
> > of the components, because much will depend on the environment in which
> > the component will be deployed. But a default won't hurt in most cases.
>
> The policy code *is* out of the components. There is only a default policy,
> everything else is set up at runtime.

Ok, just asking.

>
> > > * CORBA not implemented. What I will have in the end is a CORBA
> > > implementation that is on-par with the C++ one: both pull or push
> > > models available, proper connection management, data signalling (the
> > > new "event ports" will work over CORBA)
> >
> > Up to which point would that be compatible/comparable with the CORBA
> > event service ?
>
> I don't plan to use the event service, as you said yourself you had too
> much problems setting it up. I plan to use plain method calls, but I was
> thinking to decouple the task contexts from the CORBA method call by using
> a separate TaskContext that does the message passing between different
> processes.

Which would be the proxies ?

>
> > How do you feel about a UDP based 'streaming' data flow protocol
> > without data arrival guarantees (send and forget) ?
>
> I feel bad about it. For two reasons: first, one would have to implement it
> (which is a lot of work). Second, that is IMO only valid if there is a way
> to supervise the communication layer, which means that we would need to
> write a full communication framework. Not practical if we don't have
> someone full time doing that.

I was thinking of stealing such an implementation from somewhere of course.
I don't get the supervisor part of your answer. Do you mean that you need that
in order to detect broken connections ? What's the difference for a reader
between a broken connection, and a producer that stopped producing ? If you
can't distinguish, why do you need to manage it ? (I'm thinking about my
quality-of-data thing again of course, which solves this elegantly imvho ).

I'm also a bit worried about the push/pull differentiation. I thought a writer
would always push to 'remote buffer' and a reader always pulls from 'local
buffer'. But I have the impression that a reader's pull could trigger a writer
to produce a data sample ?
My worry is related to the survival of the data vs buffer difference in your
proposal. I had assumed that all readers had a buffer (of at least depth 1)
and that 'data' had disappeared.

>
> > > * thread-safety is not properly taken care of. I'd love to have a
> > > proper R/W lock for connection/disconnection operations, but there is
> > > none in RTT itself. Would there be a problem if I was using the boost
> > > implementation ?
> >
> > Locks are fine, but we might need to map to the correct API on LXRT and
> > Xenomai. Although they do allow posix locks in non-real-time threads,
> > which would allow boost locks in these cases.
>
> Mmmm... Would it be possible to get a Posix-Xenomai-LXRT R/W lock
> implementation in Orocos then ? That would avoid a lot of contention
> problems. Or would you have other means/ideas about how to do that ?

I'd have to understand first where it would be used before I can answer that.

>
> > > The ugly
> > > -------------
> > > * compilation times seem to be on par with the old implementation (no
> > > improvement here)
> >
> > There is/may be a solution for that one. The type system is actually the
> > key. it already allows to construct objects of type T. We could extend it
> > and only 'generate' the template based code in the toolkits, instead of
> > each possible component, which is now the case. We have a tremendous
> > amount of binary code duplication because of the templates. A factory
> > system could solve a lot, but then this would require everyone to define
> > his own toolkit (which isn't that hard nowadays, but could use
> > improvements).
>
> I thought about it. What I don't like is that you would *need* to have a
> toolkit definition for all types that passes through ports. It is actually
> cool, right now, that you can pass whatever type you want internally.
> Toolkits are only needed when you get through CORBA or inside property
> files.

Maybe you could compile a component with -DUSE_TOOLKITS in order to disable
the C++ template code generation and put a factory in place. So if a component
uses only toolkit types, it could get smaller. It would only work if that code
is inlined though.

Peter

[RFC] Redesign of the data flow interface

> > > > * thread-safety is not properly taken care of. I'd love to have a
> > > > proper R/W lock for connection/disconnection operations, but there is
> > > > none in RTT itself. Would there be a problem if I was using the boost
> > > > implementation ?
> > >
> > > Locks are fine, but we might need to map to the correct API on LXRT and
> > > Xenomai. Although they do allow posix locks in non-real-time threads,
> > > which would allow boost locks in these cases.
> >
> > Mmmm... Would it be possible to get a Posix-Xenomai-LXRT R/W lock
> > implementation in Orocos then ? That would avoid a lot of contention
> > problems. Or would you have other means/ideas about how to do that ?
>
> I'd have to understand first where it would be used before I can answer
> that.

Actually, I just stumbled across ListLockFree .. so I guess I should be using
that instead :)

Sylvain

[RFC] Redesign of the data flow interface

> > > How do you feel about a UDP based 'streaming' data flow protocol
> > > without data arrival guarantees (send and forget) ?
> >
> > I feel bad about it. For two reasons: first, one would have to implement
> > it (which is a lot of work). Second, that is IMO only valid if there is a
> > way to supervise the communication layer, which means that we would need
> > to write a full communication framework. Not practical if we don't have
> > someone full time doing that.
>
> I was thinking of stealing such an implementation from somewhere of course.
Then fine. But keep it separated from the CORBA stuff please, because I
personally won't want to use it.

> I don't get the supervisor part of your answer. Do you mean that you need
> that in order to detect broken connections ? What's the difference for a
> reader between a broken connection, and a producer that stopped producing ?
> If you can't distinguish, why do you need to manage it ? (I'm thinking
> about my quality-of-data thing again of course, which solves this elegantly
> imvho ).
OK, here's my point of view
* the quality-of-data thing cannot be implemented generically, because a
measure of "data quality" is at best data-dependent, at worst needs a
whole-system point of view.
* on a connection-oriented layer (like TCP), one can say "broken connection!"
because the reader or the writer closed the connection (think: crash for
instance). This does not solve WLAN problems (latencies because of bad
connections), so you still need to set up timeouts in critical paths, or
have a supervision layer that *knows* the whole-system constraints (like: if
there is no communication for more than 300ms, just stop this joint task
because it is useless anyway).

In my opinion, the "supervision" thing is the best solution, and would need to
be applied to CORBA as well. The only caveat: we need a "query" interface,
i.e. a way to export the communication layer status to the outer world. That
is a reason why I also wanted an intermediate component taking care of the
communication in the CORBA case:
* a component stays RT-friendly even when exported through CORBA (the actual
calls to the CORBA layer are done by the CORBA component itself).
* you have a status interface for the communication flow.

> I'm also a bit worried about the push/pull differentiation. I thought a
> writer would always push to 'remote buffer' and a reader always pulls from
> 'local buffer'. But I have the impression that a reader's pull could
> trigger a writer to produce a data sample ?
Nope. It is only a matter of having the data-holding element on the writer
side (pull) or the reader side (push). This way, what you get is an on-demand
transfer of samples, thus not wasting network bandwidth on connections that
are rarely used (think of a 10ms producer and a 500ms map builder, for
instance).

> My worry is related to the survival of the data vs buffer difference in
> your proposal. I had assumed that all readers had a buffer (of at least
> depth 1) and that 'data' had disappeared.
You could see it that way, except that the current implementation of buffers
is a FIFO that throws away samples when it is full. Data connections are ring
buffers of size 1. Give me a ring buffer, and I will replace data connections
with ring buffers (and still keep the buffers, because they are a different
policy).
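
To make the difference explicit, a minimal sketch of the two data-holding
elements as described above: the "data" element behaves like a ring buffer of
depth 1, the current "buffer" element is a FIFO that rejects samples when it
is full. Class names are made up, this is not the code from the branch:

  #include <cstddef>
  #include <deque>
  #include <optional>

  // "Data" element: effectively a ring buffer of depth 1; a write always
  // succeeds and simply overwrites whatever was there before.
  template <typename T>
  struct DataElement {
      void write(const T& s) { value = s; }
      std::optional<T> read() const { return value; }
      std::optional<T> value;
  };

  // Current "buffer" element: a FIFO of fixed capacity that rejects new
  // samples once it is full (instead of overwriting the oldest one).
  template <typename T>
  class BufferElement {
  public:
      explicit BufferElement(std::size_t capacity) : capacity_(capacity) {}

      bool write(const T& s) {             // false == sample dropped
          if (fifo_.size() >= capacity_)
              return false;
          fifo_.push_back(s);
          return true;
      }
      std::optional<T> read() {            // pop the oldest sample, if any
          if (fifo_.empty())
              return std::nullopt;
          T s = fifo_.front();
          fifo_.pop_front();
          return s;
      }
  private:
      std::size_t capacity_;
      std::deque<T> fifo_;
  };

  int main() {
      DataElement<int> data;
      data.write(1);
      data.write(2);                       // overwrites: read() now yields 2

      BufferElement<int> buffer(1);
      buffer.write(1);
      bool accepted = buffer.write(2);     // rejected: the FIFO is full
      return accepted ? 1 : 0;
  }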

> > > > * thread-safety is not properly taken care of. I'd love to have a
> > > > proper R/W lock for connection/disconnection operations, but there is
> > > > none in RTT itself. Would there be a problem if I was using the boost
> > > > implementation ?
> > >
> > > Locks are fine, but we might need to map to the correct API on LXRT and
> > > Xenomai. Although they do allow posix locks in non-real-time threads,
> > > which would allow boost locks in these cases.
> >
> > Mmmm... Would it be possible to get a Posix-Xenomai-LXRT R/W lock
> > implementation in Orocos then ? That would avoid a lot of contention
> > problems. Or would you have other means/ideas about how to do that ?
>
> I'd have to understand first where it would be used before I can answer
> that.
My question is related to making the port-level connection/disconnection code
thread-safe (i.e. modifying the internal implementation of the connection code
in Port classes). I thought R/W locks would be the best because the
connections are often read and seldom modified.
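
For the record, a minimal sketch of where such an R/W lock would sit, using
the boost implementation mentioned earlier (boost::shared_mutex): the write
path takes a shared lock to iterate over the connection list, while
connect/disconnect take an exclusive one. The ConnectionList class below is
made up for illustration, it is not the Port code itself:

  #include <boost/thread/locks.hpp>
  #include <boost/thread/shared_mutex.hpp>
  #include <functional>
  #include <vector>

  // Hypothetical connection list guarded by a readers/writer lock: pushing
  // samples only needs shared (read) access to iterate over the list, while
  // connect()/disconnect() need exclusive (write) access to modify it.
  template <typename T>
  class ConnectionList {
  public:
      typedef std::function<void(const T&)> Sink;

      void connect(const Sink& sink) {               // seldom called
          boost::unique_lock<boost::shared_mutex> lock(mutex_);
          sinks_.push_back(sink);
      }

      void disconnect_all() {                        // seldom called
          boost::unique_lock<boost::shared_mutex> lock(mutex_);
          sinks_.clear();
      }

      void write(const T& sample) {                  // called every cycle
          boost::shared_lock<boost::shared_mutex> lock(mutex_);
          for (typename std::vector<Sink>::const_iterator it = sinks_.begin();
               it != sinks_.end(); ++it)
              (*it)(sample);                         // read-only iteration
      }

  private:
      mutable boost::shared_mutex mutex_;
      std::vector<Sink> sinks_;
  };

  int main() {
      ConnectionList<double> connections;
      connections.connect([](double) { /* deliver the sample somewhere */ });
      connections.write(0.5);
  }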

> > > > The ugly
> > > > -------------
> > > > * compilation times seem to be on par with the old implementation
> > > > (no improvement here)
> > >
> > > There is/may be a solution for that one. The type system is actually
> > > the key. it already allows to construct objects of type T. We could
> > > extend it and only 'generate' the template based code in the toolkits,
> > > instead of each possible component, which is now the case. We have a
> > > tremendous amount of binary code duplication because of the templates.
> > > A factory system could solve a lot, but then this would require
> > > everyone to define his own toolkit (which isn't that hard nowadays, but
> > > could use improvements).
> >
> > I thought about it. What I don't like is that you would *need* to have a
> > toolkit definition for all types that passes through ports. It is
> > actually cool, right now, that you can pass whatever type you want
> > internally. Toolkits are only needed when you get through CORBA or inside
> > property files.
>
> Maybe you could compile a component with -DUSE_TOOLKITS in order to disable
> the C++ template code generation and put a factory in place. So if a
> component uses only toolkit types, it could get smaller. It would only work
> if that code is inlined though.
I don't think that's a good idea, because it is too much of a hassle from the
user point of view (heck, you need to make sure *all* types you're using are
defined in a toolkit). I had a solution in mind; I'll send a separate mail
about it.

And, anyway, you don't need a factory: the TypeInfo *is* (or should be) the
factory.
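
To illustrate what "the TypeInfo is the factory" could mean in practice, a
hypothetical sketch: one info object per registered type knows how to build
an untyped data-holding element, so the template code gets instantiated once
where the type is registered instead of in every component. None of the names
below (TypeInfoLike, ConnElementBase, ...) are the real RTT classes:

  #include <map>
  #include <memory>
  #include <string>

  // Untyped base class for data-holding elements, so that ports and
  // connection code can be written once, without templates.
  struct ConnElementBase {
      virtual ~ConnElementBase() {}
  };

  template <typename T>
  struct TypedDataElement : ConnElementBase {  // the only templated piece
      T value;
      TypedDataElement() : value() {}
  };

  // Hypothetical "type info": one object per registered type, acting as the
  // factory for the untyped elements above.
  struct TypeInfoLike {
      virtual ~TypeInfoLike() {}
      virtual std::unique_ptr<ConnElementBase> buildDataElement() const = 0;
  };

  template <typename T>
  struct TypedTypeInfo : TypeInfoLike {
      std::unique_ptr<ConnElementBase> buildDataElement() const override {
          return std::unique_ptr<ConnElementBase>(new TypedDataElement<T>());
      }
  };

  // A toolkit registers its types once; components then only do lookups.
  std::map<std::string, std::unique_ptr<TypeInfoLike> >& registry() {
      static std::map<std::string, std::unique_ptr<TypeInfoLike> > r;
      return r;
  }

  int main() {
      registry()["double"].reset(new TypedTypeInfo<double>());

      // A port created for "double" asks the type info to build its element,
      // without instantiating any template code in the component itself.
      std::unique_ptr<ConnElementBase> element =
          registry().at("double")->buildDataElement();
      return element ? 0 : 1;
  }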

Sylvain

[RFC] Redesign of the data flow interface

On Monday 09 March 2009 10:52:53 Sylvain Joyeux wrote:
> > > > How do you feel about a UDP based 'streaming' data flow protocol
> > > > without data arrival guarantees (send and forget) ?
> > >
> > > I feel bad about it. For two reasons: first, one would have to
> > > implement it (which is a lot of work). Second, that is IMO only valid
> > > if there is a way to supervise the communication layer, which means
> > > that we would need to write a full communication framework. Not
> > > practical if we don't have someone full time doing that.
> >
> > I was thinking of stealing such an implementation from somewhere of
> > course.
>
> Then fine. But keep it separated from the CORBA stuff please, because I
> personally won't want to use it.

And vice versa !

>
> > I don't get the supervisor part of your answer. Do you mean that you need
> > that in order to detect broken connections ? What's the difference for a
> > reader between a broken connection, and a producer that stopped producing
> > ? If you can't distinguish, why do you need to manage it ? (I'm thinking
> > about my quality-of-data thing again of course, which solves this
> > elegantly imvho ).
>
> OK, here's my point of view
> * the quality-of-data thing cannot be implemented generically, because a
> measure of "data quality" is at best data-dependent, at worst needs a
> whole-system point of view.

I agree. It should be a property of the data itself (i.e. defined at the
application level) *if* it says something about the "age" of the data. In case
it says something about the performance of the current data communication
channel, it is a good metric (say, samples/sec) and it doesn't require a
connection-oriented protocol underneath.

> * on a connection-oriented layer (like TCP), one can say "broken
> connection!" because the reader or the writer closed the connection (think:
> crash for instance). This does not solve WLAN problems (latencies because
> of bad connections, so you still need to set up timeouts in critical paths.
> Or have a supervision layer that *knows* the whole-system constrains (like:
> if there is no communication for more than 300ms, just stop this joint task
> because it is useless anyway).

You're trying to fix the problems you added yourself by using a
connection-oriented protocol. If you design your data flow as send and forget,
the writer doesn't care if anyone is listening (it's an application
architecture thing), and the reader will know no one is sending (bad quality
of incoming data). Also keep multicast-style communication in mind, which is
connection-less as well. Maybe it's possible to unify them in a very basic
interface...

>
> In my opinion, the "supervision" thing is the best solution, and would need
> to be applied to CORBA as well. The only caveat: we need a "query"
> interface, i.e. a way to export the communication layer status to the outer
> world. That is a reason why I also wanted an intermediate component taking
> care of the communication in the CORBA case:
> * a component stays RT-friendly even when exported through CORBA (the
> actual calls to the CORBA layer are done by the CORBA component itself). *
> you have a status interface for the communication flow.

You only need this because your DF is connection-oriented (I'm repeating
myself...). About the RT issues, your proxies could have a thread which
dispatches the data flow in any solution we can think of.

>
> > I'm also a bit worried about the push/pull differentiation. I thought a
> > writer would always push to 'remote buffer' and a reader always pulls
> > from 'local buffer'. But I have the impression that a reader's pull could
> > trigger a writer to produce a data sample ?
>
> Nope. That is only having the data-holding element on the writer side
> (pull) or reader side (push). This way, what you get is a on-demand
> transfer of samples, thus not wasting network bandwidth for connections
> that are rarely used (think a 10ms producer and a 500ms map-building for
> instance).

I would be inclined to say that if your application architecture has a 10ms
producer and a 500ms consumer, it's an application architecture thing you need
to fix, not something the communication framework needs to work around.
Although I fixed it like that in the same way, I wonder if the application
builder really wants to bother. In connection-less setups, you can only have
push.... keep that in mind.

>
> > My worry is related to the survival of the data vs buffer difference in
> > your proposal. I had assumed that all readers had a buffer (of at least
> > depth 1) and that 'data' had disappeared.
>
> You could see it that way, except that the current implementation of
> buffers are FIFO that throw away samples when they are full. Data
> connections are ring buffers of size 1. Give me a ring buffer, and I will
> replace data connections by ring buffers (and still keep the buffers
> because they are a different policy).

Got your point.

>
> > > > > * thread-safety is not properly taken care of. I'd love to have a
> > > > > proper R/W lock for connection/disconnection operations, but there
> > > > > is none in RTT itself. Would there be a problem if I was using the
> > > > > boost implementation ?
> > > >
> > > > Locks are fine, but we might need to map to the correct API on LXRT
> > > > and Xenomai. Although they do allow posix locks in non-real-time
> > > > threads, which would allow boost locks in these cases.
> > >
> > > Mmmm... Would it be possible to get a Posix-Xenomai-LXRT R/W lock
> > > implementation in Orocos then ? That would avoid a lot of contention
> > > problems. Or would you have other means/ideas about how to do that ?
> >
> > I'd have to understand first where it would be used before I can answer
> > that.
>
> My question is related to making the port-level connection/disconnection
> code thread-safe (i.e. modifying the internal implementation of the
> connection code in Port classes). I thought R/W locks would be the best
> because the connections are often read and seldom modified.

Often read by more than one thread ??? Read-write locks are only useful in
very multi-threaded applications. Connection/disconnection stuff doesn't seem
to fit into that ?

>
> > > > > The ugly
> > > > > -------------
> > > > > * compilation times seem to be on par with the old implementation
> > > > > (no improvement here)
> > > >
> > > > There is/may be a solution for that one. The type system is actually
> > > > the key. it already allows to construct objects of type T. We could
> > > > extend it and only 'generate' the template based code in the
> > > > toolkits, instead of each possible component, which is now the case.
> > > > We have a tremendous amount of binary code duplication because of the
> > > > templates. A factory system could solve a lot, but then this would
> > > > require everyone to define his own toolkit (which isn't that hard
> > > > nowadays, but could use improvements).
> > >
> > > I thought about it. What I don't like is that you would *need* to have
> > > a toolkit definition for all types that passes through ports. It is
> > > actually cool, right now, that you can pass whatever type you want
> > > internally. Toolkits are only needed when you get through CORBA or
> > > inside property files.
> >
> > Maybe you could compile a component with -DUSE_TOOLKITS in order to
> > disable the C++ template code generation and put a factory in place. So
> > if a component uses only toolkit types, it could get smaller. It would
> > only work if that code is inlined though.
>
> I don't think that a good idea because it is too much of a hassle from the
> user point of view (heck, you need to make sure *all* types you're using
> are defined in a toolkit). I had a solution in mind, I'll send a separate
> about it.

I'm betting it's better than mine !

Peter

[RFC] Redesign of the data flow interface

> You're trying to fix the problems you added yourself by using a connection
> oriented protocol. In case you design your data flow as a send and forget,
> the writer doesn't care if anyone is listening (it's an application
> architecture thing), the reader will know no-one is sending (bad quality of
> incomming data.) Also keep multicast-style in mind, which is
> connection-less as well. Maybe it's possible to unify them in a very basic
> interface...

To sum up the current interface:
* there are connect()/disconnect() calls. They are part of the component's
Configuration interface, and
- they don't assume a connection-oriented layer
- they don't guarantee that data will *be* flowing. Heck, the producer in the
connection can have a bug.
- the write port does NOT know specifically WHO is reading (and the read port
does not know specifically WHO is writing). They only know that *someone* is.
* the connect() call specifies the actual parameters for the connection. Right
now, there are the following parameters:
- data management type: "data", which is a ring buffer of size 1, and
"buffer", which is a FIFO rejecting data when it is full. One can also
specify the locking mechanism (mutex or lock-free), and the size in the case
of a buffer connection.
- pull/push: meaningful only in multi-process connections. In "push" mode
the data management element is on the receiver side (i.e. the writer
does not keep any data internally). In "pull" mode this element is on the
sending side, i.e. the receiver has to actively ask for the transfer.
To better support connection-less protocols, it could be moved from here
into the CORBA layer (because CORBA supports it). Maybe a better solution
than pull/push would be to have, in the connection policies, a specification
of the ratio of samples to get through (i.e. on this connection, send every
sample; on that connection, send 1/10 of the samples). See the sketch after
this list.
* write is "send and forget". There is INTERNALLY a functionality that allows,
in case you are using a connection-oriented protocol, destroying the
connections when the actual underlying connection is closed. This is
transport-dependent and in no case mandatory.
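
As for the "ratio of samples" idea, one way to picture it is a per-connection
decimation element sitting in the channel: the writer stays send-and-forget,
and the policy decides which fraction of the samples actually crosses the
transport. The sketch below is only meant to illustrate the idea, the names
are made up:

  #include <functional>
  #include <iostream>

  // Hypothetical channel element that forwards only 1 out of every
  // 'period' samples written to it (period == 1 forwards everything).
  template <typename T>
  class DecimatingChannel {
  public:
      typedef std::function<void(const T&)> Sink;

      DecimatingChannel(unsigned period, Sink sink)
          : period_(period ? period : 1), sink_(std::move(sink)) {}

      void write(const T& sample) {        // send and forget for the writer
          if (count_++ % period_ == 0)
              sink_(sample);               // only this fraction hits the wire
      }

  private:
      unsigned period_;
      unsigned count_ = 0;
      Sink sink_;
  };

  int main() {
      // e.g. a 10ms producer feeding a consumer that only wants 1/10 of the data
      DecimatingChannel<int> to_gui(10, [](int s) { std::cout << "sent " << s << "\n"; });
      for (int i = 0; i < 30; ++i)
          to_gui.write(i);                 // prints 0, 10, 20
  }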

Sylvain

[RFC] Redesign of the data flow interface

On Tuesday 10 March 2009 10:49:59 Sylvain Joyeux wrote:
> > You're trying to fix the problems you added yourself by using a
> > connection oriented protocol. In case you design your data flow as a send
> > and forget, the writer doesn't care if anyone is listening (it's an
> > application architecture thing), the reader will know no-one is sending
> > (bad quality of incomming data.) Also keep multicast-style in mind, which
> > is
> > connection-less as well. Maybe it's possible to unify them in a very
> > basic interface...
>
> To sum up the current interface:
> * there is connect()/disconnect() calls. They are part of the components's
> Configuration interface, and
> - they don't assume a connection-oriented layer
> - they don't assure that data will *be* flowing. Heck the producer in
> the connection can have a bug.
> - the write port point of view does NOT know specifically WHO is reading
> (and the read port does not know specifically WHO is writing). They
> only know that *someone* is.

Ok. I'll take a look at the connect/disconnect calls, and see how they can
work in topic-oriented middleware (like ROS or the event service).

> * in the connect() call is specified the actual parameters for the
> connection. Right now, there is the following parameters:
> - data management type: there is "data" which is a ringbuffer of size 1
> and "buffer" which is a FIFO rejecting data when it is full. One can also
> specify the locking mechanism (mutex or lock-free), and the size in case of
> a buffer connection.
> - pull/push: meaningful only in multi-process connections. In "push"
> mode the data management element is on the receiver side (i.e. the writer
> does not keep any data internally). In "pull" mode this element is on the
> sending side. I.e. the receiver has to actively ask for its transfer. To
> have a better support of connection-less protocols, it could be moved from
> here into the CORBA layer (because CORBA supports it). Maybe a better
> solution than pull/push would be to have in the connection policies a
> specification of what is the ratio of samples to get through (i.e. on this
> connection, send every samples, on that connection send 1/10 of the
> samples).

I certainly want only one unified interface atop different transports.

> * write is "send and forget" there is INTERNALLY a functionality that
> allows, in case you are using a connection-oriented protocol, to destroy
> connections when the actual underlying connection is closed. This is
> transport-dependent and in no case mandatory.

Ok.

My summary:
Your design already solves a big part of what is broken in the current data
flow implementation. That's more than enough to merge your stuff in for 2.0.
But I still see only the 'typical Orocos' scenario:
* Component A (writer) is created
* Component B (reader) is created
* Some management component 'C' connects A to B and monitors the connections

While other frameworks work like this:
* Component A (writer) is created, starts pushing data (to a 'topic') when
ready
* Component B (reader) is created, starts processing data when received (from
a 'topic')

Imagine that A and B are two (mobile) robots, which work together. In case of
any network failure, 'C' won't be of any help anymore... unless 'C's
functionality is locally available as well.
The second scenario is on the other hand ultimately decoupled, but as you
wrote, has trade-offs.

I'm actually making two points here:
1. Other frameworks work with/allow topic-based data flow; we need to be very
careful not to exclude interoperation with such frameworks.
2. 'Unmanaged' connections are of big use in very decoupled systems, like
multiple autonomous robot architectures.

I must admit that I have lost myself in the implementation details in the past
discussions. I'd rather agree on these two first, and then see what the
technical consequences are...

Peter

[RFC] Redesign of the data flow interface

On Thursday 19 March 2009 09:02:22 Peter Soetens wrote:
> On Tuesday 10 March 2009 10:49:59 Sylvain Joyeux wrote:
> > > You're trying to fix the problems you added yourself by using a
> > > connection oriented protocol. In case you design your data flow as a
> > > send and forget, the writer doesn't care if anyone is listening (it's
> > > an application architecture thing), the reader will know no-one is
> > > sending (bad quality of incomming data.) Also keep multicast-style in
> > > mind, which is
> > > connection-less as well. Maybe it's possible to unify them in a very
> > > basic interface...
> >
> > To sum up the current interface:
> > * there is connect()/disconnect() calls. They are part of the
> > components's Configuration interface, and
> > - they don't assume a connection-oriented layer
> > - they don't assure that data will *be* flowing. Heck the producer in
> > the connection can have a bug.
> > - the write port point of view does NOT know specifically WHO is
> > reading (and the read port does not know specifically WHO is writing).
> > They only know that *someone* is.
>
> Ok. I'll take a look at the connect/disconnect calls, and see how they can
> work in topic-oriented middleware (like ROS or the event service work).
[snip]
> Your design already solves a big part of what is broken in the current data
> flow implementation. That's more than enough to merge your stuff in for 2.0
> But I still see only the 'typical Orocos' scenario:
> * Component A (writer) is created
> * Component B (reader) is created
> * Some management component 'C' connects A to B and monitors the
> connections
>
> While other frameworks work like this:
> * Component A (writer) is created, starts pushing data (to a 'topic') when
> ready
> * Component B (reader) is created, starts processing data when received
> (from a 'topic')

To boot, those "topic oriented middlewares" *are* actually implemented in
practice through a per-process service discovery mechanism which tells the
modules where to send the data. At least for those I know of: Carmen, ROS and
one that people have implemented at DFKI. That would be easily integrated in
the API I propose as a separate service discovery component which takes care
of connection/disconnection.

Note that by "connection-oriented", I do NOT mean TCP. It simply means that
INTERNALLY each process knows that it has to send data, and has information as
to the destination. The destination can be either a specific host:port pair
for IP networks, or even a multicast destination (in which case the sender
does not know specifically to whom it sends, it simply knows that it sends to
a group).

> Imagine that A and B are two (mobile) robots, which work together. In case
> of any network failure, 'C' won't be of any help anymore... unless 'C's
> functionality is locally available as well.

"C" does not have to be a centralized component. It can (and has to) be
decentralized as well. See my PhD thesis for an example ;-). My point of view
is that topic-oriented middlewares (as you call them) are a "false true idea"
that comes back every once in a while. A friend of mind told me that the ROS
people have know a tool that allows to display how the specific data
connections exist between ROS modules. Just because that information is needed
to understand what is going on (and, therefore, in my opinion, to supervise
it).

Now, I'm not saying that the new interface should not allow it of course ...
But as I said it can be easily implemented through a per-process service
discovery mechanism that creates connections when services are made available
and disconnects when they are not available anymore (based, for instance, on
timeouts).

> I'm actually making two points here
> 1. Other frameworks work/allow topic based, we need to be very careful not
> to exclude interoperation with such frameworks.
> 2. 'Unmanaged' connections are of big use on very decoupled systems, like
> multiple autonomous robot architectures.
Example of a (horribly useless but totally unmanaged) connection scheme with
the new interface (a minimal sketch follows this list):
* a convention maps "port name" + "data type" to a topic name
* a configuration file maps topic names to multicast address/port numbers
* a per-process "connection management" component just creates the
connections internally.
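
A toy version of that convention, with entirely made-up names and a
hard-coded table standing in for the configuration file:

  #include <iostream>
  #include <map>
  #include <string>

  // Convention: topic name derived from port name + data type.
  std::string topic_name(const std::string& port, const std::string& type) {
      return port + "." + type;
  }

  int main() {
      // Stand-in for the configuration file: topic -> multicast address:port.
      std::map<std::string, std::string> topics = {
          { "scans.LaserScan",     "239.0.0.1:7001" },
          { "pose.RigidBodyState", "239.0.0.1:7002" },
      };

      // What a per-process "connection management" component would do:
      // look up each local port's topic and create the channel internally.
      std::string topic = topic_name("scans", "LaserScan");
      std::map<std::string, std::string>::const_iterator it = topics.find(topic);
      if (it != topics.end())
          std::cout << "create channel for " << topic << " on " << it->second << "\n";
  }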

I think now that it would actually make sense to rename (in the code)
"connections" into "channels".

Implementation-wise, we have the following (see the sketch after this list):
* each write port knows a certain number of channels it is attached to.
* on the write port side, each channel is associated with an opaque ID that
allows removing that particular channel later on. In a port-to-port model,
that ID is a unique representation of a read port. The important point is
that one does not know, based on this ID, what the other end of the channel
is. It is simply a means to remove a specific channel from the write port's
channel list.
* each read port knows whether a channel is attached to it or not.
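
A sketch of what the write-port side bookkeeping could look like, with an
opaque ID per channel whose only use is to remove that channel later. Again,
the names are made up for illustration, they are not the classes in the
branch:

  #include <cstdint>
  #include <map>
  #include <memory>

  struct Channel {                        // stand-in for the actual channel element
      virtual ~Channel() {}
  };

  class WritePortBase {
  public:
      typedef std::uint64_t ChannelId;    // opaque: says nothing about the other end

      ChannelId addChannel(std::unique_ptr<Channel> ch) {
          ChannelId id = next_id_++;
          channels_[id] = std::move(ch);
          return id;                      // kept by whoever set the channel up
      }

      void removeChannel(ChannelId id) {  // the only thing the ID is good for
          channels_.erase(id);
      }

      bool connected() const { return !channels_.empty(); }

  private:
      ChannelId next_id_ = 0;
      std::map<ChannelId, std::unique_ptr<Channel> > channels_;
  };

  int main() {
      WritePortBase port;
      WritePortBase::ChannelId id =
          port.addChannel(std::unique_ptr<Channel>(new Channel()));
      // ... later, whoever created the channel can tear down just this one:
      port.removeChannel(id);
      return port.connected() ? 1 : 0;
  }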

Why "channel" ? Because in the cases you are talking about, the write port is
not connected to a particular port, but to a group (i.e. "everybody that has
the need for that particular data sample"). Therefore "connection" may be
improper, and "channel" better (from the ports point of view, the channels are
a tunnel in which they push or pull data).

I should note that it would only be a renaming effort. The actual
implementation would not change at all.

Would that be clearer ?

Sylvain

[RFC] Redesign of the data flow interface

On Thursday 19 March 2009 11:51:42 Sylvain Joyeux wrote:
> >
> > While other frameworks work like this:
> > * Component A (writer) is created, starts pushing data (to a 'topic')
> > when ready
> > * Component B (reader) is created, starts processing data when received
> > (from a 'topic')
>
> To boot, those "topic oriented middlewares" *are* actually implemented in
> practice through a per-process service discovery mechanisms which tells to
> the modules where to send the data. At least for those I know of - Carmen,
> ROS and one that people have implemented at DFKI. That would be easily
> integrated in the API I propose as a separate service discovery component
> which takes care of connection/disconnection.

Ok. Got it.

>
> Note that by "connection-oriented", I do NOT MEAN TCP, it simply means that
> INTERNALLY each process knows that it has to send data, and has information
> as to the destination. Destination being either a specific host:port pair
> for IP networks, or can even be a multicast destination (in which case one
> does not know specifically to whom, it simply knows that it sends to a
> group)

I got that point somewhere along the way as well.

>
> > Imagine that A and B are two (mobile) robots, which work together. In
> > case of any network failure, 'C' won't be of any help anymore... unless
> > 'C's functionality is locally available as well.
>
> "C" does not have to be a centralized component. It can (and has to) be
> decentralized as well. See my PhD thesis for an example ;-). My point of
> view is that topic-oriented middlewares (as you call them) are a "false
> true idea" that comes back every once in a while. A friend of mind told me
> that the ROS people have know a tool that allows to display how the
> specific data connections exist between ROS modules. Just because that
> information is needed to understand what is going on (and, therefore, in my
> opinion, to supervise it).

We're lacking that as well in the current code base. You can only see from
inspecting an XML file how data might be flowing. Very limiting.

>
> Now, I'm not saying that the new interface should not allow it of course
> ... But as I said it can be easily implemented through a per-process
> service discovery mechanism that creates connections when services are made
> available and disconnects when they are not available anymore (based, for
> instance, on timeouts).

Ok.

>
> > I'm actually making two points here
> > 1. Other frameworks work/allow topic based, we need to be very careful
> > not to exclude interoperation with such frameworks.
> > 2. 'Unmanaged' connections are of big use on very decoupled systems, like
> > multiple autonomous robot architectures.
>
> Example of an (horribly useless but totally unmanaged) connection scheme
> with the new interface:
> * a convention assigns "port name" + "data type" to topic name
> * configuration file assigns topic name to multicast address/port number.
> * and a per-process "connection management" component just creates
> connections internally.
>
> I think now that it would actually make sense to rename (in the code)
> "connections" into "channels".
>
> Implementation-wise, we have the following:
> * each write port know a certain number of channels it is attached to.
> * on the write port side, each channel is associated to an opaque ID that
> allows, later on, to remove that particular channel. In a port-to-port
> model, that ID is a unique representation of a read port. The important
> point is that one does not know, based on this ID, what is the other end
> of the channel. It is simply a mean to remove a specific channel from
> the write port's channel list.
> * each read port knows whether one channel is attached to it or not.
>
> Why "channel" ? Because in the cases you are talking about, the write port
> is not connected to a particular port, but to a group (i.e. "everybody that
> has the need for that particular data sample"). Therefore "connection" may
> be improper, and "channel" better (from the ports point of view, the
> channels are a tunnel in which they push or pull data).
>
> I should note that it would only be a renaming effort. The actual
> implementation would not change at all.
>
> Would that be clearer ?

Yes. Go with renaming to channels.

Peter

[RFC] Redesign of the data flow interface

What I'm trying to get across is that communication quality constraints are to
be seen from a *system level* point of view, not a module-level point of view.

Of course, the sender does not have to know if its samples go through. Still,
I personally believe it is valuable to know that it is connected to something,
so that you can have the "connection state" of the reader/writer ports into
the configuration interface. Note that I do agree that it is not enough (i.e.
if you do have a connection set up but no samples are going through, you're in
trouble).

Of course, you could have the receiver use timeout-based stuff to know whether
it has new samples or not.

The problem is that, by doing that, you actually embed *in the receiver* a
metric that is part of a whole-system concern.

More specifically, for instance, a SLAM TaskContext could be used "on demand"
on one system (low dynamics system, don't care about the actual update rate,
using reliable transports like TCP) and be constrained on another (high
dynamics robot, update rate must be above 5Hz, using UDP transport with
"smart" retransmission)

An even more interesting behaviour would be to actually *change* how your
system behaves *based on the communication quality*. Again, using the SLAM
example, one could actually leave the SLAM alone (be purely data-driven) and
slow down if localization is not following. Or stop other CPU-demanding
tasks that are not critical.

In my opinion, given that such a change (moving from high dynamics to low
dynamics) has obvious impacts on the rest of the system (like: will I be at my
goal on time now ?), the decision about how to solve the problem should not be
embedded in the components themselves ... Or those components will not be
reusable.

> I would be inclined to say that if your application architecture has a 10ms
> producer and a 500ms consumer, it's an application architecture thing you
> need to fix, not something the communication framework needs to work
> around.
In my opinion, this is actually not a workaround. Stop thinking "one
producer, one consumer" and start thinking "one producer, multiple consumers".
You can have a producer at 10ms (for instance a laser scanner), a consumer at
10ms (local safety measure, for instance slowing down the robot if obstacles
are too near), a consumer at 500ms (SLAM) and a consumer at 5s (on-demand
update of a GUI for instance).

> Although I fixed it like that in the same way, I wonder if the
> application builder really wants to bother. In connection-less setups, you
> can only have push.... keep that in mind.
And waste your time marshalling/demarshalling for data you won't use anyway.

> > My question is related to making the port-level connection/disconnection
> > code thread-safe (i.e. modifying the internal implementation of the
> > connection code in Port classes). I thought R/W locks would be the best
> > because the connections are often read and seldom modified.
>
> Often read by more than one thread ??? Read-write locks are only usefull in
> very multi-threaded applications. connection/disconnection stuff doesn't
> seem to fit into that ?

Well, the connection implementation still needs to push data to more than one
consumer ... and therefore needs to iterate over the connection list (i.e. the
list of consumers). That is reading. Modifying is seldom done.

Sylvain

[RFC] Redesign of the data flow interface

On Tuesday 10 March 2009 10:35:19 Sylvain Joyeux wrote:
> What I'm trying to get across is that communication quality constraints are
> to be seen from a *system level* point of view, not a module-level point of
> view.

That's the optimal case where there is a hierarchy of structured modules.
That's the architecture we see in single robotic systems. Fine, I agree there.
What if your components are truly only peer-to-peer and there is no
mastering, i.e. all decisions must be made locally ? I'm not going to be the
swarm robotics zealot, but previous reasoning in our own 'rigid' architecture
delivered inflexible designs. I'd like to counter that a bit.

>
> Of course, the sender does not have to know if its samples go through.
> Still, I personally believe it is valuable to know that it is connected to
> something, so that you can have the "connection state" of the reader/writer
> ports into the configuration interface. Note that I do agree that it is not
> enough (i.e. if you do have a connection set up but no samples are going
> through, you're in trouble).
>
> Of course, you could have the receiver use timeout-based stuff to know
> whether it has new samples or not.
>
> The problem is that, by doing that, you actually embed *in the receiver* a
> metric that is part of a whole-system concern.

Maybe you're more worried about embedding connection management code into a
'behavioral' component (mixing it with the algorithm) ? (I believe the four
'C's are bound to come up here). Such code indeed limits reusability, but only
for pure algorithmic components, not for decision-making components, which
are already of limited reusability (they are indeed system-specific).

>
> More specifically, for instance, a SLAM TaskContext could be used "on
> demand" on one system (low dynamics system, don't care about the actual
> update rate, using reliable transports like TCP) and be constrained on
> another (high dynamics robot, update rate must be above 5Hz, using UDP
> transport with "smart" retransmission)
>
> An even more interesting behaviour would be to actually *change* how your
> system behaves *based on the communication quality*. Again, using the SLAM
> example, one could actually leave the SLAM alone (be purely data-driven)
> and slowing down if localization is not following. Or stop other
> CPU-demanding tasks that are not critical.

I'm not sure if we're ready yet for optimisation scenarios.

>
> In my opinion, given that such a change (moving from high dynamics to low
> dynamics) has obvious impacts on the rest of the system (like: will I be at
> my goal on time now ?), the decision about how to solve the problem should
> not be embedded in the components themselves ... Or those components will
> become not reusable.

Totally true, except that some components have the sole purpose of embedding
decisions.

>
> > I would be inclined to say that if your application architecture has a
> > 10ms producer and a 500ms consumer, it's an application architecture
> > thing you need to fix, not something the communication framework needs to
> > work around.
>
> In my opinion, this is actually not a work around. Stop thinking "one
> producer, one consumer" and start thinking "one producer, multiple
> consumers". You can have a producer at 10ms (for instance laser scanner), a
> consumer at 10ms (local safety measure, for instance slowing down the robot
> if obstacles are too near), a consumer at 500ms (SLAM) and a consumer at 5s
> (on-demand update on a GUI for instance).

I had this in mind, but I must agree that a push-pull 'preference' should be
available. They are both standard architectural primitives, found in many
systems.

>
> > Although I fixed it like that in the same way, I wonder if the
> > application builder really wants to bother. In connection-less setups,
> > you can only have push.... keep that in mind.
>
> And waste your time marshalling/demarshalling for data you won't use
> anyway.

Yes, but it's a trade-off just like you're making them.

>
> > > My question is related to making the port-level
> > > connection/disconnection code thread-safe (i.e. modifying the internal
> > > implementation of the connection code in Port classes). I thought R/W
> > > locks would be the best because the connections are often read and
> > > seldom modified.
> >
> > Often read by more than one thread ??? Read-write locks are only usefull
> > in very multi-threaded applications. connection/disconnection stuff
> > doesn't seem to fit into that ?
>
> Well, the connection implementation still needs to push data to more than
> one consumer ... and therefore needs to iterate on the connection lists
> (i.e. the list of consumers). That is reading. Modifying is seldom done.

And the number of threads involved ? Wouldn't the reading be done from only
one thread (the component's activity) in case of push and only one connection
object is read by one thread (the comm thread) in case of pull ?

Peter

[RFC] Redesign of the data flow interface

On Thursday 19 March 2009 09:02:01 Peter Soetens wrote:
> On Tuesday 10 March 2009 10:35:19 Sylvain Joyeux wrote:
> > What I'm trying to get across is that communication quality constraints
> > are to be seen from a *system level* point of view, not a module-level
> > point of view.
>
> That's the optimal case where there is a hierarchy of structured modules.
> That's the architecture we see in single robotic systems. Fine, I agree
> there. What if your components are truely only peer-to-peer and there is no
> mastering, ie all descions must be made locally ? I'm not going to be the
> swarm robotics zealot, but previous reasoning in our own 'rigid'
> architecture, delivered unflexible designs. I'd like to counter that a bit.
Maybe that comes from the way your own "rigid" architecture worked. I know
mono-robot architectures that are actually very flexible *and* supervised.

You are actually going in my direction (thanks !). If there is no hierarchy,
the fact that one side needs positions at 10Hz is not something the other side
has to care about. Therefore, the sender cannot in general know what the data
quality is from the point of view of the receiver. That is the job of
decision-making components local to each robot. And each of these local
components would be able to decide whether a specific data flow channel is
actually useful for the given robot (and, if not, notify the components that
they cannot assume they will be getting data on their ports anymore. They can
do that by disconnecting the said ports).

> > Of course, the sender does not have to know if its samples go through.
> > Still, I personally believe it is valuable to know that it is connected
> > to something, so that you can have the "connection state" of the
> > reader/writer ports into the configuration interface. Note that I do
> > agree that it is not enough (i.e. if you do have a connection set up but
> > no samples are going through, you're in trouble).
> >
> > Of course, you could have the receiver use timeout-based stuff to know
> > whether it has new samples or not.
> >
> > The problem is that, by doing that, you actually embed *in the receiver*
> > a metric that is part of a whole-system concern.
>
> Maybe you're more worried about embedding connection management code into a
> 'behavioral' component (mixing it with the algorithm) ? (I believe the four
> 'C's are bound to come up here). Such code indeed limits reusability, but
> only for pure algorithmic components, not for descision making components,
> which are already limited reusable (they are indeed system specific).
Yes. You are actually following my point here: the connections of your
computation components have to be managed by a decision-making component. Or
you are mixing both and therefore limit reusability.

On a related note: *indeed* decision-making components are reusable. A simple
case of a coordination component would be a "configuration management" model
(like auRa for instance) in which
- for each configuration, the states of each module and the data links
between them are specified
- models tell how to modify the configuration based on the actual perceived
data (communication concerns and others).

The model is system-specific (but could even be made partially reusable
through model composition); the component would be generic.

> > More specifically, for instance, a SLAM TaskContext could be used "on
> > demand" on one system (low dynamics system, don't care about the actual
> > update rate, using reliable transports like TCP) and be constrained on
> > another (high dynamics robot, update rate must be above 5Hz, using UDP
> > transport with "smart" retransmission)
> >
> > An even more interesting behaviour would be to actually *change* how your
> > system behaves *based on the communication quality*. Again, using the
> > SLAM example, one could actually leave the SLAM alone (be purely
> > data-driven) and slowing down if localization is not following. Or stop
> > other CPU-demanding tasks that are not critical.
>
> I'm not sure if we're ready yet for optimisation scenarios.
That's the point: *it is not the job of the RTT to do it*. It is the job of
decision-making components that reconfigure the RTT layer and are not
necessarily RTT components themselves.

> > In my opinion, given that such a change (moving from high dynamics to low
> > dynamics) has obvious impacts on the rest of the system (like: will I be
> > at my goal on time now ?), the decision about how to solve the problem
> > should not be embedded in the components themselves ... Or those
> > components will become not reusable.
>
> Totally true, except that some components have the sole purpose to embed
> descisions.
Of course, I actually have one. And IMO if you are actually writing ad-hoc
decision-making components, this is soooooo 1992 ;-). Generic tools exist for
decision-making, I can tell you *use them if you can*.

> > > Although I fixed it like that in the same way, I wonder if the
> > > application builder really wants to bother. In connection-less setups,
> > > you can only have push.... keep that in mind.
> >
> > And waste your time marshalling/demarshalling for data you won't use
> > anyway.
>
> Yes, but it's a trade-off just like you're making them.
I'm not saying everybody should use that feature. The point being that you can
always ignore the fact that you are not connected and send data anyway. The
other way around is not possible.

> > > > My question is related to making the port-level
> > > > connection/disconnection code thread-safe (i.e. modifying the
> > > > internal implementation of the connection code in Port classes). I
> > > > thought R/W locks would be the best because the connections are often
> > > > read and seldom modified.
> > >
> > > Often read by more than one thread ??? Read-write locks are only
> > > usefull in very multi-threaded applications. connection/disconnection
> > > stuff doesn't seem to fit into that ?
> >
> > Well, the connection implementation still needs to push data to more than
> > one consumer ... and therefore needs to iterate on the connection lists
> > (i.e. the list of consumers). That is reading. Modifying is seldom done.
>
> And the number of threads involved ? Wouldn't the reading be done from only
> one thread (the component's activity) in case of push and only one
> connection object is read by one thread (the comm thread) in case of pull ?
Yes, but the modification can be initiated outside of these threads. And, for
now, it is possible to disconnect from only one side (i.e. tell the reader
"disconnect !" without knowing the writer) -- therefore the reader thread can
initiate a modification of the writer's connection list.

That could be managed by using Orocos method objects, but I have the feeling
that using the ListLockFree as it is done right now is a good enough solution.

Sylvain

[RFC] Redesign of the data flow interface

On Thu, 19 Mar 2009, Sylvain Joyeux wrote:

> On Thursday 19 March 2009 09:02:01 Peter Soetens wrote:
>> On Tuesday 10 March 2009 10:35:19 Sylvain Joyeux wrote:
[...]
>> Totally true, except that some components have the sole purpose to embed
>> descisions.
> Of course, I actually have one. And IMO if you are actually writing ad-hoc
> decision-making components, this is soooooo 1992 ;-). Generic tools exist for
> decision-making, I can tell you *use them if you can*.

Can you provide some links, please, to these "generic tools for decision
making"?

Herman

[RFC] Redesign of the data flow interface

On Tue, 10 Mar 2009, Sylvain Joyeux wrote:

> What I'm trying to get across is that communication quality constraints are to
> be seen from a *system level* point of view, not a module-level point of view.

I fully agree with this point of view.

> Of course, the sender does not have to know if its samples go through. Still,
> I personally believe it is valuable to know that it is connected to something,
> so that you can have the "connection state" of the reader/writer ports into
> the configuration interface. Note that I do agree that it is not enough
> (i.e. if you do have a connection set up but no samples are going
> through, you're in trouble).

This implies some form of communication "handshake" _protocol_, which is
again the responsibility of the "system level" Coordination.

> Of course, you could have the receiver use timeout-based stuff to know whether
> it has new samples or not.
>
> The problem is that, by doing that, you actually embed *in the receiver* a
> metric that is part of a whole-system concern.
Indeed! Hence: bad idea :-) At least at the system level: a time-out _is_ a good
idea locally!

> More specifically, for instance, a SLAM TaskContext could be used "on demand"
> on one system (low dynamics system, don't care about the actual update rate,
> using reliable transports like TCP) and be constrained on another (high
> dynamics robot, update rate must be above 5Hz, using UDP transport with
> "smart" retransmission)
>
> An even more interesting behaviour would be to actually *change* how your
> system behaves *based on the communication quality*. Again, using the SLAM
> example, one could actually leave the SLAM alone (be purely data-driven) and
> slowing down if localization is not following. Or stop other CPU-demanding
> tasks that are not critical.

All these are certainly valid policies...

> In my opinion, given that such a change (moving from high dynamics to low
> dynamics) has obvious impacts on the rest of the system (like: will I be at my
> goal on time now ?), the decision about how to solve the problem should not be
> embedded in the components themselves ... Or those components will become not
> reusable.
I fully agree.

>> I would be inclined to say that if your application architecture has a 10ms
>> producer and a 500ms consumer, it's an application architecture thing you
>> need to fix, not something the communication framework needs to work
>> around.
> In my opinion, this is actually not a work around. Stop thinking "one
> producer, one consumer" and start thinking "one producer, multiple consumers".
> You can have a producer at 10ms (for instance laser scanner), a consumer at
> 10ms (local safety measure, for instance slowing down the robot if obstacles
> are too near), a consumer at 500ms (SLAM) and a consumer at 5s (on-demand
> update on a GUI for instance).

Yes.

>> Although I fixed it like that in the same way, I wonder if the
>> application builder really wants to bother. In connection-less setups, you
>> can only have push.... keep that in mind.
> And waste your time marshalling/demarshalling for data you won't use anyway.
>
>>> My question is related to making the port-level connection/disconnection
>>> code thread-safe (i.e. modifying the internal implementation of the
>>> connection code in Port classes). I thought R/W locks would be the best
>>> because the connections are often read and seldom modified.
>>
>> Often read by more than one thread ??? Read-write locks are only usefull in
>> very multi-threaded applications. connection/disconnection stuff doesn't
>> seem to fit into that ?
>
> Well, the connection implementation still needs to push data to more than one
> consumer ... and therefore needs to iterate on the connection lists (i.e. the
> list of consumers). That is reading. Modifying is seldom done.

And if Modifying _is_ needed, it is to be considered as a _separate_ data
producing component anyway...

> Sylvain

Herman

[RFC] Redesign of the data flow interface

> > Of course, the sender does not have to know if its samples go through.
> > Still, I personally believe it is valuable to know that it is connected
> > to something, so that you can have the "connection state" of the
> > reader/writer ports into the configuration interface. Note that I do
> > agree that it is not enough (i.e. if you do have a connection set up but
> > no samples are going through, you're in trouble).
>
> This implies some form of communication "handshake" _protocol_, which is
> again the responsibility of the "system level" Coordination.
I do agree. Setting up the communication is to be done at coordination level.

But I personally think that *knowing* a communication has been set up is part
of the configuration interface of the component itself. I.e. a component may
change what it computes, or even its behaviour, based on which ports are
connected or not.

Here, by "connection", I mean again that the handshake you are referring to
has been done, *not* that there is an actual way to get data through. It so
happens that in the case of the current CORBA layer, both are connected (but
they don't have to be).

Moreover, again in the current state of the CORBA layer, there is a "negative
information" case, which is "TCP connection broken". This case is semantically
interpreted *internally by the layer* as "the communication between the two
ports is broken". Why? Because there is no automatic reconnection implemented,
so (1) the ports need to know they are not connected anymore and (2) as long as
the coordination level does not reconnect the two ports, there *is* no
communication anymore.

> > Of course, you could have the receiver use timeout-based stuff to know
> > whether it has new samples or not.
> >
> > The problem is that, by doing that, you actually embed *in the receiver*
> > a metric that is part of a whole-system concern.
>
> Indeed! Hence: bad idea :-) At leas at system level: a time-out _is_ a good
> idea locally!
I'm not sure I get what you mean. For me, it is a good workaround for the lack
of a Coordination layer, or for cases where the coordination layer cannot react
"fast enough" and one therefore needs to do it locally. Other than that, I think
avoiding embedded timeouts is good.

> >> Often read by more than one thread ??? Read-write locks are only usefull
> >> in very multi-threaded applications. connection/disconnection stuff
> >> doesn't seem to fit into that ?
> >
> > Well, the connection implementation still needs to push data to more than
> > one consumer ... and therefore needs to iterate on the connection lists
> > (i.e. the list of consumers). That is reading. Modifying is seldom done.
>
> And if Modifying _is_ needed, it is to be considered as a _separate_ data
> producing component anyway...
By "reading" and "modifying", I am referring to the "connection list" of the
write port. Again, this is an internal implementation thing but boils down to
that fact: if the write port needs to send to multiple targets, it has to
represent that set one way or the other.

To send samples, it iterates over the set, i.e. the set has to be read
To implement connection/disconnection, that set has to be modified, i.e. the
set has to be written.
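
To make that read-mostly pattern concrete, here is a minimal C++ sketch -- not
the actual RTT classes, all names below are made up for the example -- where
sending read-locks the connection set and connecting write-locks it, using the
boost shared mutex discussed earlier in the thread:

#include <boost/thread/shared_mutex.hpp>
#include <boost/thread/locks.hpp>
#include <cstddef>
#include <vector>

template <typename T>
struct ConnectionSketch {
    virtual ~ConnectionSketch() {}
    virtual void push(T const& sample) = 0; // deliver one sample to one reader
};

template <typename T>
class WritePortSketch {
public:
    void write(T const& sample) {
        // frequent: iterate over the set under the shared (read) lock
        boost::shared_lock<boost::shared_mutex> lock(mutex_);
        for (std::size_t i = 0; i < connections_.size(); ++i)
            connections_[i]->push(sample);
    }
    void addConnection(ConnectionSketch<T>* c) {
        // rare: modify the set under the exclusive (write) lock
        boost::unique_lock<boost::shared_mutex> lock(mutex_);
        connections_.push_back(c);
    }
private:
    boost::shared_mutex mutex_;
    std::vector<ConnectionSketch<T>*> connections_;
};

Whether this ends up being a boost R/W lock or the existing LockFreeList is an
implementation detail; the access pattern (many reads, rare writes) stays the
same.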

Sylvain

[RFC] Redesign of the data flow interface

On Thu, 12 Mar 2009, Sylvain Joyeux wrote:

>>> Of course, the sender does not have to know if its samples go through.
>>> Still, I personally believe it is valuable to know that it is connected
>>> to something, so that you can have the "connection state" of the
>>> reader/writer ports into the configuration interface. Note that I do
>>> agree that it is not enough (i.e. if you do have a connection set up but
>>> no samples are going through, you're in trouble).
>>
>> This implies some form of communication "handshake" _protocol_, which is
>> again the responsibility of the "system level" Coordination.
> I do agree. Setting up the communication is to be done at coordination level.
>
> But I personally think that *knowing* a communication has been set up is part
> of the configuration interface of the component itself. I.e. a component may
> change what it computes, or even its behaviour, based on which ports are
> connected or not.

Yes! Your posts begin to read more and more as if I had written them myself :-)

> Here, by "connection", I mean again that the handshake you are referring to
> has been done, *not* that there is an actual way to get data through. It so
> happens that in the case of the current CORBA layer, both are connected (but
> they don't have to be).

> Moreover, again in the current state of the CORBA layer, there is a "negative
> information" case, which is "TCP connection broken". This case is semantically
> interpreted *internally by the layer* into "the communication between the two
> ports are broken". Why ? Because there is no way of automatic reconnection
> implemented, so (1) the port needs to know they are not connected anymore and
> (2) as long as the coordination level does not reconnect the two ports they
> *is* no communication anymore.
>
>>> Of course, you could have the receiver use timeout-based stuff to know
>>> whether it has new samples or not.
>>>
>>> The problem is that, by doing that, you actually embed *in the receiver*
>>> a metric that is part of a whole-system concern.
>>
>> Indeed! Hence: bad idea :-) At leas at system level: a time-out _is_ a good
>> idea locally!
> I'm not sure I get what you mean. For me, it is a good workaround for a lack
> of Coordination layer or because the coordination layer cannot react "fast
> enough" and therefore one needs to do it locally. Other than that, I think
> avoiding embedded timeouts is good.
I see no way to avoid them altogether, at least not in real distributed
systems: in those systems, connections can be broken in such a way that nothing
remains but "discovering" the problem locally.

>>>> Often read by more than one thread ??? Read-write locks are only usefull
>>>> in very multi-threaded applications. connection/disconnection stuff
>>>> doesn't seem to fit into that ?
>>>
>>> Well, the connection implementation still needs to push data to more than
>>> one consumer ... and therefore needs to iterate on the connection lists
>>> (i.e. the list of consumers). That is reading. Modifying is seldom done.
>>
>> And if Modifying _is_ needed, it is to be considered as a _separate_ data
>> producing component anyway...
> By "reading" and "modifying", I am referring to the "connection list" of the
> write port. Again, this is an internal implementation thing but boils down to
> that fact: if the write port needs to send to multiple targets, it has to
> represent that set one way or the other.
>
> To send samples, it iterates over the set, i.e. the set has to be read
> To implement connection/disconnection, that set has to be modified, i.e. the
> set has to be written.
>
> Sylvain

Herman

[RFC] Redesign of the data flow interface

On Tue, Mar 10, 2009 at 10:35:19AM +0100, Sylvain Joyeux wrote:
> What I'm trying to get across is that communication quality constraints are to
> be seen from a *system level* point of view, not a module-level point of view.
>
> Of course, the sender does not have to know if its samples go through. Still,
> I personally believe it is valuable to know that it is connected to something,
> so that you can have the "connection state" of the reader/writer ports into
> the configuration interface. Note that I do agree that it is not enough (i.e.
> if you do have a connection set up but no samples are going through, you're in
> trouble).
>
> Of course, you could have the receiver use timeout-based stuff to know whether
> it has new samples or not.
>
> The problem is that, by doing that, you actually embed *in the receiver* a
> metric that is part of a whole-system concern.
>
> More specifically, for instance, a SLAM TaskContext could be used "on demand"
> on one system (low dynamics system, don't care about the actual update rate,
> using reliable transports like TCP) and be constrained on another (high
> dynamics robot, update rate must be above 5Hz, using UDP transport with
> "smart" retransmission)
>
> An even more interesting behaviour would be to actually *change* how your
> system behaves *based on the communication quality*. Again, using the SLAM
> example, one could actually leave the SLAM alone (be purely data-driven) and
> slowing down if localization is not following. Or stop other CPU-demanding
> tasks that are not critical.
>
> In my opinion, given that such a change (moving from high dynamics to low
> dynamics) has obvious impacts on the rest of the system (like: will I be at my
> goal on time now ?), the decision about how to solve the problem should not be
> embedded in the components themselves ... Or those components will become not
> reusable.

Everybody is more and more agreeing... :-) I think this is what Herman
calls "Coordination".

> > I would be inclined to say that if your application architecture has a 10ms
> > producer and a 500ms consumer, it's an application architecture thing you
> > need to fix, not something the communication framework needs to work
> > around.
> In my opinion, this is actually not a work around. Stop thinking "one
> producer, one consumer" and start thinking "one producer, multiple consumers".
> You can have a producer at 10ms (for instance laser scanner), a consumer at
> 10ms (local safety measure, for instance slowing down the robot if obstacles
> are too near), a consumer at 500ms (SLAM) and a consumer at 5s (on-demand
> update on a GUI for instance).

Agreed that such filtering would be very useful to have. My only
doubt is whether these advanced port mechanisms would not be better
implemented as one or more components. I have a feeling we should
build applications more hierarchically and build higher-level
components from sub-components. This would allow us to simplify
simple component interfaces much more.

Regards
Markus

[RFC] Redesign of the data flow interface

On Thu, 5 Mar 2009, Sylvain Joyeux wrote:

>> Sounds good. I hope we can keep the policy code as much as possible out of
>> the components, because much will depend on the environment in which the
>> component will be deployed. But a default won't hurt in most cases.
> The policy code *is* out of the components. There is only a default policy,
> everything else is set up at runtime.
>
>>> * CORBA not implemented. What I will have in the end is a CORBA
>>> implementation that is on-par with the C++ one: both pull or push
>>> models available, proper connection management, data signalling (the new
>>> "event ports" will work over CORBA)
>>
>> Up to which point would that be compatible/comparable with the CORBA event
>> service ?
> I don't plan to use the event service, as you said yourself you had too much
> problems setting it up. I plan to use plain method calls, but I was thinking
> to decouple the task contexts from the CORBA method call by using a separate
> TaskContext that does the message passing between different processes.

I think this is a "Good Practice"... Probably we need it in many more cases
where one _really_ wants the "network and process transparency" offered by
communication middleware such as CORBA.

>> How do you feel about a UDP based 'streaming' data flow protocol
>> without data arrival guarantees (send and forget) ?
> I feel bad about it. For two reasons: first, one would have to implement it
> (which is a lot of work). Second, that is IMO only valid if there is a way to
> supervise the communication layer, which means that we would need to write a
> full communication framework. Not practical if we don't have someone full time
> doing that.
Indeed. Let's leave this to the communication middleware projects. (Too)
many of them _are_ already doing this, by the way... (In a
non-interoperable way, sigh.)

>>> * thread-safety is not properly taken care of. I'd love to have a proper
>>> R/W lock for connection/disconnection operations, but there is none in
>>> RTT itself. Would there be a problem if I was using the boost
>>> implementation ?
>>
>> Locks are fine, but we might need to map to the correct API on LXRT and
>> Xenomai. Although they do allow posix locks in non-real-time threads, which
>> would allow boost locks in these cases.
> Mmmm... Would it be possible to get a Posix-Xenomai-LXRT R/W lock
> implementation in Orocos then ? That would avoid a lot of contention problems.
> Or would you have other means/ideas about how to do that ?
>
>>> The ugly
>>> -------------
>>> * compilation times seem to be on par with the old implementation (no
>>> improvement here)
>>
>> There is/may be a solution for that one. The type system is actually the
>> key. it already allows to construct objects of type T. We could extend it
>> and only 'generate' the template based code in the toolkits, instead of
>> each possible component, which is now the case. We have a tremendous amount
>> of binary code duplication because of the templates. A factory system could
>> solve a lot, but then this would require everyone to define his own toolkit
>> (which isn't that hard nowadays, but could use improvements).
> I thought about it. What I don't like is that you would *need* to have a
> toolkit definition for all types that passes through ports. It is actually
> cool, right now, that you can pass whatever type you want internally. Toolkits
> are only needed when you get through CORBA or inside property files.

I would replace "CORBA" here with the more general use case of "network and
process transparant services", but I agree with the core of the statement.
_If_ you have an application in which you know certain things about how
different activities are scheduled, you should be able to make use of this
knowledge and configure the most efficient implementation.

Herman

[RFC] Redesign of the data flow interface

On Wed, Mar 04, 2009 at 06:32:41PM +0100, Sylvain Joyeux wrote:

...
> Current status
> =========
>
> The good
> --------------
> * one-writer, multiple-readers model. Multiple ports can read one single
> write port, but there is no way for multiple write ports to write on the
> same read port.

Ermh, why is this good? Wouldn't that be useful?

> * no weird behaviour in case of multiple buffer connections that read the same
> port.

I guess you mean it's not "load-balancing", but distribution to all
now? Ok.

> * runtime connection policies. The parameters for each connection is set at
> runtime. A default policy can be specified at the read port level in order
> to simplify component usage (if you don't know what the component needs,
> then just use its default policy)

Care to explain what it's about?

> * whatever the policy is, one can know if a sample has ever been written on
> it. I.e. no need to have a "default value" that means "no data" anymore.
> * proper connection management: if one side disconnects, then the other side
> knows that nobody is connected to them anymore.
> * a write ports can optionally "remember" what was the last sample
> written on them. If that is enabled, the policy allows to initialize
> new connections with the last value written on the port (if there is one).

Such corner cases are nasty, but I guess this is old cruft?

> The bad
> --------------
> * CORBA not implemented. What I will have in the end is a CORBA
> implementation that is on-par with the C++ one: both pull or push models
> available, proper connection management, data signalling (the new "event
> ports" will work over CORBA)

As long as it doesn't increase the dependency on CORBA.

...
> Any constructive comment is welcome

Thanks for doing this cleanup!

I'm thinking more and more that AMP-style messages, RTT::Events and
also Ports are really the same thing. Buffering should (in the long
run) become a Configuration (Deployment) level property. Unifying
these concepts would allow for a great deal of simplification.

Best regards
Markus

[RFC] Redesign of the data flow interface

> > * one-writer, multiple-readers model. Multiple ports can read one single
> > write port, but there is no way for multiple write ports to write on
> > the same read port.
>
> Ermh, why is this good? Wouldn't that be a useful?
It is not a "good" thing (I put it in the wrong category). It is actually what
it is. While I do see the usefulness of load-balancing using connections, I
personally thing it is a corner case. Moreover, I have the feeling that,
anyway, one needs a separate component to reorder samples after they have been
processed (so you can have a component doing the load balancing as well ...).

Let me explain: let's assume that there are two components A and B processing
the same type of data. Currently, you do (I'm guessing here):

src_buffer => (A or B) => dst_buffer

Okay, but how do you make sure that dst_buffer ends up in the same order as
src_buffer ? You can't ...

> > * no weird behaviour in case of multiple buffer connections that read
> > the same port.
>
> I guess you mean it's not "load-balancing", but distibution to all
> now? Ok.
Yes.

> > * runtime connection policies. The parameters for each connection is set
> > at runtime. A default policy can be specified at the read port level in
> > order to simplify component usage (if you don't know what the component
> > needs, then just use its default policy)
>
> Care to explain what it's about?
The idea is that ports are only read and write ports. No strings attached to
them (like buffer vs. data, size of buffers, lock type, ...).

Then, at connection time, you specify how you want each specific connection to
be done. Given one write port A and three read ports rB, rC and rD, you can
build the following three connections (see the sketch below):
A => rB is a data connection, using mutexes and pushing data
A => rC is a data connection, using lock-free and pulling data
A => rD is a buffered connection of size 10, using lock-free and pushing data
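
For illustration, this is roughly what those three policies could look like in
code. The ConnPolicy fields and the connection call below are assumptions made
up for this example, not the final interface:

#include <cstddef>

struct ConnPolicy {
    enum Type { DATA, BUFFER } type;             // last-value vs. FIFO connection
    enum Lock { LOCKED, LOCK_FREE } lock_policy; // mutex-based vs. lock-free
    bool        pull;                            // false: push on write(), true: fetch on read()
    std::size_t size;                            // buffer depth, only meaningful for BUFFER
};

int main() {
    ConnPolicy toB = { ConnPolicy::DATA,   ConnPolicy::LOCKED,    false, 0  }; // A => rB
    ConnPolicy toC = { ConnPolicy::DATA,   ConnPolicy::LOCK_FREE, true,  0  }; // A => rC
    ConnPolicy toD = { ConnPolicy::BUFFER, ConnPolicy::LOCK_FREE, false, 10 }; // A => rD
    // a hypothetical call site would then be something like
    //   A.createConnection(rB, toB); A.createConnection(rC, toC); A.createConnection(rD, toD);
    (void)toB; (void)toC; (void)toD;
    return 0;
}

The point is that rB, rC and rD remain plain read ports; everything that
differs between the three links lives in the policy object passed at connection
time.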

Other stuff that I would imagine being nice is time-based filtering (let pass
one sample at most every 10ms) or sample-based filtering (let pass 1/3 of the
samples).

> > * whatever the policy is, one can know if a sample has ever been written
> > on it. I.e. no need to have a "default value" that means "no data"
> > anymore. * proper connection management: if one side disconnects, then
> > the other side knows that nobody is connected to them anymore.
> > * a write ports can optionally "remember" what was the last sample
> > written on them. If that is enabled, the policy allows to initialize
> > new connections with the last value written on the port (if there is
> > one).
>
> Such corner cases are nasty but I guess this old cruft?
Which "corner cases" ? Remembering the last written sample is actually useful
for debugging (and was asked in the initial discussion). Initializing new
connections ... well ... I added it because it was simple to do and I had the
feeling some people would want it.

> > * CORBA not implemented. What I will have in the end is a CORBA
> > implementation that is on-par with the C++ one: both pull or push
> > models available, proper connection management, data signalling (the new
> > "event ports" will work over CORBA)
>
> As long it doesn't increase the dependecy on CORBA.
CORBA remains an optional dependency.

Sylvain

[RFC] Redesign of the data flow interface

On Thu, Mar 05, 2009 at 11:40:10AM +0100, Sylvain Joyeux wrote:
> > > * one-writer, multiple-readers model. Multiple ports can read one single
> > > write port, but there is no way for multiple write ports to write on
> > > the same read port.
> >
> > Ermh, why is this good? Wouldn't that be a useful?
> It is not a "good" thing (I put it in the wrong category). It is actually what
> it is. While I do see the usefulness of load-balancing using connections, I
> personally thing it is a corner case. Moreover, I have the feeling that,
> anyway, one needs a separate component to reorder samples after they have been
> processed (so you can have a component doing the load balancing as well ...).

Ok, that seems reasonable.

> Let me explain: let's assume that there is two components A and B processing
> the same type of data. Currently, you do (I'm guessing here):
>
> src_buffer => (A or B) => dst_buffer
>
> Okay, but how do you make sure that dst_buffer ends up in the same order than
> src_buffer ? You can't ...

Yes, trying to emulate this would be very wrong IMO.

> > > * runtime connection policies. The parameters for each connection is set
> > > at runtime. A default policy can be specified at the read port level in
> > > order to simplify component usage (if you don't know what the component
> > > needs, then just use its default policy)
> >
> > Care to explain what it's about?
> This is about that ports are only read and write ports. No string attached to
> them (like buffer, data, size of buffers, lock type, ...).
>
> Then, at connection, you specify how you want each specific connection to be
> done. Given one write port A and three read ports rB, and rC and rD, you can
> build the following three connections:
> A => rB is data connection, using mutexes and pushing data
> A => rC is data connection, using lock-free and pulling data

Can you explain what the difference between pushing and pulling means
in this context?

> A => rD is buffered connection of size 10 using lock-free and pushing data
>
> Other stuff that I would imagine being nice is time-based filtering (let pass
> one sample at most every 10ms) or sample-based filtering (let pass 1/3 of the
> samples).

Wouldn't these be better moved into a separate component?

> > > * whatever the policy is, one can know if a sample has ever been written
> > > on it. I.e. no need to have a "default value" that means "no data"
> > > anymore. * proper connection management: if one side disconnects, then
> > > the other side knows that nobody is connected to them anymore.
> > > * a write ports can optionally "remember" what was the last sample
> > > written on them. If that is enabled, the policy allows to initialize
> > > new connections with the last value written on the port (if there is
> > > one).
> >
> > Such corner cases are nasty but I guess this old cruft?
> Which "corner cases" ? Remembering the last written sample is actually useful
> for debugging (and was asked in the initial discussion). Initializing new

Hmm, maybe I wasn't there at that time. This seems to be a somewhat
crude debugging mechanism, however? Given your new "multicast" ports,
wouldn't it be easier and more powerful to simply attach a debugging
component to a port for this purpose?

> connections ... well ... I added it because it was simple to do and I had the
> feeling some people would want it.

I must admit this seems somewhat awkward to me. Being a Unix guy I
can't help thinking of pipes and fifos, O_NONBLOCK and poll and
select. I think we shouldn't hesitate to steal these mechanisms and not
try to invent new ones which do the same thing. But maybe we _are_
trying to do something different here and I'll be pleased to learn :-)

> > > * CORBA not implemented. What I will have in the end is a CORBA
> > > implementation that is on-par with the C++ one: both pull or push
> > > models available, proper connection management, data signalling (the new
> > > "event ports" will work over CORBA)
> >
> > As long it doesn't increase the dependecy on CORBA.
> CORBA remains an optional dependency.

OK, great!

Best regards
Markus

[RFC] Redesign of the data flow interface

> > > > * runtime connection policies. The parameters for each connection is
> > > > set at runtime. A default policy can be specified at the read port
> > > > level in order to simplify component usage (if you don't know what
> > > > the component needs, then just use its default policy)
> > >
> > > Care to explain what it's about?
> >
> > This is about that ports are only read and write ports. No string
> > attached to them (like buffer, data, size of buffers, lock type, ...).
> >
> > Then, at connection, you specify how you want each specific connection to
> > be done. Given one write port A and three read ports rB, and rC and rD,
> > you can build the following three connections:
> > A => rB is data connection, using mutexes and pushing data
> > A => rC is data connection, using lock-free and pulling data
>
> Can you explain what the difference between pushing and pulling means
> in this context?
In a single-process context, no difference. In a multi-process connection,
"pushing" means that the sample will be pushed over the connection when
written. "Pulling" means that it will get over the connection only when the
reader calls read(). I think that it is useful, but it is mainly a means, for
me, to test that the interface does support this kind of feature.

Actually, now that I think of it, it should probably be removed from the
ConnPolicy class and made a CORBA-specific configuration option.
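
A rough sketch of the difference, with transport_send()/transport_fetch()
standing in for whatever remote call (e.g. the CORBA one) would actually be
used -- the names are purely illustrative:

#include <iostream>

static void   transport_send(double v) { std::cout << "sample " << v << " crosses the wire now\n"; }
static double transport_fetch()        { std::cout << "crossing the wire on demand\n"; return 0.0; }

// Push: the writer pays the transport cost at write() time; read() stays local.
struct PushChannel {
    double last_received;
    void   write(double v) { transport_send(v); last_received = v; }
    double read() const    { return last_received; }
};

// Pull: write() stays local to the writer; the reader pays the cost at read() time.
struct PullChannel {
    double last_written;
    void   write(double v) { last_written = v; }
    double read() const    { return transport_fetch(); }
};

int main() {
    PushChannel push_chan; push_chan.write(1.0); push_chan.read();
    PullChannel pull_chan; pull_chan.write(2.0); pull_chan.read();
    return 0;
}

In-process the two behave identically; the only thing that changes is which
side of a remote connection triggers the transfer.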

> > A => rD is buffered connection of size 10 using lock-free and pushing
> > data
> >
> > Other stuff that I would imagine being nice is time-based filtering (let
> > pass one sample at most every 10ms) or sample-based filtering (let pass
> > 1/3 of the samples).
>
> Wouldn't these be better moved into seperate component?
Probably. It was an example really.

> > > > * whatever the policy is, one can know if a sample has ever been
> > > > written on it. I.e. no need to have a "default value" that means "no
> > > > data" anymore. * proper connection management: if one side
> > > > disconnects, then the other side knows that nobody is connected to
> > > > them anymore. * a write ports can optionally "remember" what was the
> > > > last sample written on them. If that is enabled, the policy allows to
> > > > initialize new connections with the last value written on the port
> > > > (if there is one).
> > >
> > > Such corner cases are nasty but I guess this old cruft?
> >
> > Which "corner cases" ? Remembering the last written sample is actually
> > useful for debugging (and was asked in the initial discussion).
> > Initializing new
>
> Hmm, I maybe I wasn't there at that time. This seems to be a somewhat
> crude debugging mechanism however? Given your new "multicast" ports
> wouldn't it be easier and more powerful to simply attach a debugging
> component to a port for this purpose?

Probably, yes. Don't get me wrong, I actually plan to not use those features
;-)

> > connections ... well ... I added it because it was simple to do and I had
> > the feeling some people would want it.
>
> I must admit to me this seems somewhat awkward. Being a Unix guy I
> can't help of thinking of pipes and fifo, O_NONBLOCK and poll and
> select. I think we shouln't hesitate to steal these mechanisms and not
> try to invent new ones which do the same thing. But maybe we _are_
> trying to do something different here and I'll be pleased to learn :-)
I don't see the link. We already have poll() and select() on a per-port basis,
and both read ports and write ports are non-blocking. Nothing new really.

Connection initialization is useful when one considers a state estimate (say:
position). It would actually make sense IMO to have a position reader get the
latest position at connection initialization, rather than having to wait for
the next update.
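
A minimal sketch of that behaviour -- class and member names are invented for
the example, not taken from the branch:

#include <cstddef>
#include <iostream>
#include <vector>

struct PositionReader {
    bool   has_data;
    double position;
    void receive(double p) { has_data = true; position = p; }
};

class PositionWriter {
public:
    PositionWriter() : has_last_(false), last_(0.0) {}

    void write(double p) {
        has_last_ = true; last_ = p;   // "remember" the last written sample
        for (std::size_t i = 0; i < readers_.size(); ++i)
            readers_[i]->receive(p);
    }

    void connect(PositionReader& r, bool init_with_last) {
        readers_.push_back(&r);
        if (init_with_last && has_last_)
            r.receive(last_);          // new reader starts from the last estimate
    }

private:
    bool has_last_;
    double last_;
    std::vector<PositionReader*> readers_;
};

int main() {
    PositionWriter w;
    w.write(3.14);                     // estimate published before anyone listens
    PositionReader late = { false, 0.0 };
    w.connect(late, true);             // gets 3.14 right away, no wait for the next update
    std::cout << late.position << "\n";
    return 0;
}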

[RFC] Redesign of the data flow interface

On Wednesday 04 March 2009 18:32:41 Sylvain Joyeux wrote:
> Hello everyone ...
>
> A while back I talked about redesigning the data flow part of the RTT... So
> here is the starting point.

It seems like yesterday :-)

>
> Current status
> =========
>
> The good
> --------------
> * one-writer, multiple-readers model. Multiple ports can read one single
> write port, but there is no way for multiple write ports to write on
> the same read port.
> * no weird behaviour in case of multiple buffer connections that read the
> same port.
> * runtime connection policies. The parameters for each connection is set
> at runtime. A default policy can be specified at the read port level in
> order to simplify component usage (if you don't know what the component
> needs, then just use its default policy)
> * whatever the policy is, one can know if a sample has ever been written
> on it. I.e. no need to have a "default value" that means "no data" anymore.
> * proper connection management: if one side disconnects, then the other
> side knows that nobody is connected to them anymore.
> * a write ports can optionally "remember" what was the last sample
> written on them. If that is enabled, the policy allows to initialize
> new connections with the last value written on the port (if there is
> one).

Sounds good. I hope we can keep the policy code as much as possible out of the
components, because much will depend on the environment in which the component
will be deployed. But a default won't hurt in most cases.

>
> The bad
> --------------
> * CORBA not implemented. What I will have in the end is a CORBA
> implementation that is on-par with the C++ one: both pull or push models
> available, proper connection management, data signalling (the new "event
> ports" will work over CORBA)

Up to which point would that be compatible/comparable with the CORBA event
service ? How do you feel about a UDP based 'streaming' data flow protocol
without data arrival guarantees (send and forget) ?

> * thread-safety is not properly taken care of. I'd love to have a proper
> R/W lock for connection/disconnection operations, but there is none in RTT
> itself. Would there be a problem if I was using the boost implementation ?

Locks are fine, but we might need to map to the correct API on LXRT and
Xenomai. Although they do allow posix locks in non-real-time threads, which
would allow boost locks in these cases.

> * no backward compatibility. That could be implemented easily though.

We'll have to see which courses can be taken to make the transition as smooth
(i.e. automatic) as possible for every user.

>
> The ugly
> -------------
> * compilation times seem to be on par with the old implementation (no
> improvement here)

There is/may be a solution for that one. The type system is actually the key:
it already allows constructing objects of type T. We could extend it and only
'generate' the template-based code in the toolkits, instead of in each possible
component, which is now the case. We have a tremendous amount of binary code
duplication because of the templates. A factory system could solve a lot, but
then this would require everyone to define his own toolkit (which isn't that
hard nowadays, but could use improvements).

>
> Checking it out
> =========
>
> The code is the new_data_flow branch of orocos-rtt on my github account
> http://github.com/doudou/orocos-rtt/commits/new_data_flow
>
> It is based on Peter's master of the day before tomorrow.
> See the testPort* methods in tests/generictask_test_3.cpp for examples.
>
> To peter: please, PLEASE, *PLEASE* don't push that branch on SVN. I'd like
> to keep control on this code (meaning: not having to rebase every two days
> to keep up with your updates).
>
> Any constructive comment is welcome

I cloned your git repository. I'll submit patches to you such that you can
decide how/when stuff gets merged. There's a rtt-2.0 branch on github that I'm
preparing which will eventually merge your contributions. I don't plan to use
SVN for 2.0. If your branch would be merged before summer, that would be
great.

Time to take a look at the code !

Peter

[RFC] Redesign of the data flow interface

> > * compilation times seem to be on par with the old implementation (no
> > improvement here)
>
> There is/may be a solution for that one. The type system is actually the
> key. it already allows to construct objects of type T. We could extend it
> and only 'generate' the template based code in the toolkits, instead of
> each possible component, which is now the case. We have a tremendous amount
> of binary code duplication because of the templates. A factory system could
> solve a lot, but then this would require everyone to define his own toolkit
> (which isn't that hard nowadays, but could use improvements).

We could get away with it by doing the following:
* have an MPL function returning "false" by default
* have each toolkit define a header where this function is "true" for the types
supported by the toolkit.

Then, one can include the appropriate toolkit header in one's task context
definition files, in which case RTT selects at compile time a method that only
calls the TypeInfo for connection creation (and therefore, no more emission of
templates).
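
A small sketch of what such a compile-time switch could look like, written with
a plain trait rather than Boost.MPL but following the same idea; all names are
invented for the example, the real thing would live in the RTT type system and
the toolkit headers:

#include <iostream>

// Default: no toolkit claims this type, so fall back to the template-emitting path.
template <typename T> struct known_to_toolkit { static const bool value = false; };

// A toolkit header would specialise the trait for the types it covers.
struct Frame {}; // stand-in for a type shipped by some toolkit
template <> struct known_to_toolkit<Frame> { static const bool value = true; };

// Two implementations, chosen at compile time from the trait.
template <bool UseTypeInfo> struct ConnFactorySketch {
    template <typename T> static void create(T const&) {
        std::cout << "emitting template-based connection code\n";
    }
};
template <> struct ConnFactorySketch<true> {
    template <typename T> static void create(T const&) {
        std::cout << "delegating to the non-template TypeInfo factory\n";
    }
};

template <typename T> void createConnection(T const& sample) {
    ConnFactorySketch<known_to_toolkit<T>::value>::create(sample);
}

int main() {
    createConnection(42);      // int: no toolkit header included, template path
    createConnection(Frame()); // Frame: toolkit path, no new template emission
    return 0;
}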

Of course, it does not work if the header is included but the plugin does not
get loaded at runtime. Therefore, we would also need a way for shared
libraries to load plugins (and to make sure each plugin is loaded at most once
at runtime). Then, you could:
* have a shared library #include the appropriate header from the toolkit
* have that header both define the appropriate MPL functions *and* make sure
the toolkit is loaded whenever that shared library is used.

Sylvain

[RFC] Redesign of the data flow interface

On Monday 09 March 2009 10:55:41 Sylvain Joyeux wrote:
> > > * compilation times seem to be on par with the old implementation (no
> > > improvement here)
> >
> > There is/may be a solution for that one. The type system is actually the
> > key. it already allows to construct objects of type T. We could extend it
> > and only 'generate' the template based code in the toolkits, instead of
> > each possible component, which is now the case. We have a tremendous
> > amount of binary code duplication because of the templates. A factory
> > system could solve a lot, but then this would require everyone to define
> > his own toolkit (which isn't that hard nowadays, but could use
> > improvements).
>
> We could get away with it by doing the following:
> * have a MPL function returning "false" by default
> * have toolkit define a header where this function is "true" for the
> types supported by the toolkit.

Perfect.

>
> Then, one can include the appropriate toolkit header in its task context
> definition files, in which case RTT selects at compile-time a method that
> only calls the TypeInfo for connection creation (and therefore, no more
> emission of templates).
>
> Of course, it does not work if the header is included by the plugin does
> not get loaded at runtime. Therefore, we would also need a way that shared
> libraries load plugins (and make sure the plugin is loaded at most once at
> runtime). Then, you could:
> * have a shared library #include the appropriate header from the toolkit
> * that header would both define the appropriate MPL functions *and* make
> sure the toolkit is loaded whenever that shared library is used.

Very possible. It's clear that the plugin loading must move from
OCL/Deployment to RTT.

Peter

[RFC] Redesign of the data flow interface

On Wed, 4 Mar 2009, Sylvain Joyeux wrote:

> Hello everyone ...
>
> A while back I talked about redesigning the data flow part of the RTT... So
> here is the starting point.
Thanks!!

I add some discussion remarks that basically try to make clear
which of the four concerns (Coordination, Computation, Communication and
Configuration) should be tackling your suggestions/remarks. My personal
conclusion from this "homework" is that Orocos definitely needs a separate
Coordination branch...

> Current status
> =========
>
> The good
> --------------
> * one-writer, multiple-readers model. Multiple ports can read one single
> write port, but there is no way for multiple write ports to write on the
> same read port.
This is a "Coordination" concern, that should not be left to the
Communications implementation only. In my opinion, your second policy needs
an _activity_ that takes care of a (deterministic/guaranteed)
implementation of this policy; it also requires a non-trivial
Configuration. So, I think this (desirable!) extension should not be done
in just an communication object library only.

> * no weird behaviour in case of multiple buffer connections that read the same
> port.
> * runtime connection policies. The parameters for each connection is set at
> runtime. A default policy can be specified at the read port level in order
> to simplify component usage (if you don't know what the component needs,
> then just use its default policy)
> * whatever the policy is, one can know if a sample has ever been written on
> it. I.e. no need to have a "default value" that means "no data" anymore.
> * proper connection management: if one side disconnects, then the other side
> knows that nobody is connected to them anymore.

I don't like this direct coupling between both communicating partners...

> * a write ports can optionally "remember" what was the last sample
> written on them. If that is enabled, the policy allows to initialize
> new connections with the last value written on the port (if there is one).
This, in my opinion, belongs to "Coordination", not in the Communication
implementation.

> The bad
> --------------
> * CORBA not implemented. What I will have in the end is a CORBA
> implementation that is on-par with the C++ one: both pull or push models
> available, proper connection management, data signalling (the new "event
> ports" will work over CORBA)
CORBA _implementations_ should remain _outside_ of the Orocos source tree!
Configuration _could_ be part of it, but then in a branch of the source
tree that can be configured out.

> * thread-safety is not properly taken care of. I'd love to have a proper R/W
> lock for connection/disconnection operations, but there is none in RTT
> itself. Would there be a problem if I was using the boost implementation ?
Again, this belongs in Coordination. Is the boost implementation designed
to be used in a multi-component environment? (That is, does it take
Coordination into account?) Or is it "just" Communication level classes?

> * no backward compatibility. That could be implemented easily though.
If it could be done by only Configuration, that would be very nice...

> The ugly
> -------------
> * compilation times seem to be on par with the old implementation (no
> improvement here)
Compilation of what exactly?

> Checking it out
> =========
>
> The code is the new_data_flow branch of orocos-rtt on my github account
> http://github.com/doudou/orocos-rtt/commits/new_data_flow
>
> It is based on Peter's master of the day before tomorrow.
> See the testPort* methods in tests/generictask_test_3.cpp for examples.
>
> To peter: please, PLEASE, *PLEASE* don't push that branch on SVN. I'd like to
> keep control on this code (meaning: not having to rebase every two days to
> keep up with your updates).
>
> Any constructive comment is welcome

I hope to have served this purpose :-)

> Sylvain

Herman

[RFC] Redesign of the data flow interface

> > * one-writer, multiple-readers model. Multiple ports can read one single
> > write port, but there is no way for multiple write ports to write on
> > the same read port.
>
> This is a "Coordination" concern, that should not be left to the
> Communications implementation only. In my opinion, your second policy needs
> an _activity_ that takes care of a (deterministic/guaranteed)
> implementation of this policy; it also requires a non-trivial
> Configuration. So, I think this (desirable!) extension should not be done
> in just an communication object library only.
You will have to rephrase this, because these sentences are way too obscure.

> > * no weird behaviour in case of multiple buffer connections that read the
> > same port.
> > * runtime connection policies. The parameters for each connection is set
> > at runtime. A default policy can be specified at the read port level in
> > order to simplify component usage (if you don't know what the component
> > needs, then just use its default policy)
> > * whatever the policy is, one can know if a sample has ever been written
> > on it. I.e. no need to have a "default value" that means "no data"
> > anymore. * proper connection management: if one side disconnects, then
> > the other side knows that nobody is connected to them anymore.
>
> I don't like this direct coupling between both communicating partners...
Well. A connection (i.e. a communication in Hermanspeak) is a two-partner
thing. I don't see how one component can talk to the void (which is what it is
doing right now). So, yes, when one stops talking to the other, that HAS an
effect on the other.

> > * a write ports can optionally "remember" what was the last sample
> > written on them. If that is enabled, the policy allows to initialize
> > new connections with the last value written on the port (if there is
> > one).
>
> This, in my opinion, belongs to "Coordination", not in the Communication
> implementation.
Explain more. Why ?

>
> > The bad
> > --------------
> > * CORBA not implemented. What I will have in the end is a CORBA
> > implementation that is on-par with the C++ one: both pull or push
> > models available, proper connection management, data signalling (the new
> > "event ports" will work over CORBA)
>
> CORBA _implementations_ should remain _outside_ of the Orocos source tree!
> Configuration _could_ be part of it, but then in a branch of the source
> tree that can be configured out.
Yes. Of course. And the CORBA implementation does not know about the new data ports.

> > * thread-safety is not properly taken care of. I'd love to have a proper
> > R/W lock for connection/disconnection operations, but there is none in
> > RTT itself. Would there be a problem if I was using the boost
> > implementation ?
>
> Again, this belongs in Coordination. Is the boost implementation designed
> to be used in a multi-component environment? (That is, does it take
> Coordination into account?) Or is it "just" Communication level classes?
Grmbl. You are mixing the concept level and the implementation level. Orocos is
inherently a multi-threaded environment, so the C++ objects that make up Orocos
components need to be thread-safe. Period.

> > * no backward compatibility. That could be implemented easily though.
>
> If it could be done by only Configuration, that would be very nice...
Not possible, as the current implementation hard-codes stuff that is now part
of the policy environment (i.e. the configuration of the communication).

Honestly, Herman, I have the feeling that, with your concepts, you really want
to make the whole implementation unmanageable. Communication between
components needs to be Configured (what I call communication policies) and also
needs to be Coordinated (to allow modifications by different components). Fine.

Now, what I would agree to is the following: to unify, in the code, the naming
of the classes in terms of Coordination and Configuration. For instance, stop
calling the communication policy ConnPolicy and call it ConnConfiguration (or
whatever).

What I will NEVER agree to is to separate everything just so that it maps onto
your mindset. The communication layer (which, in this case, is limited to data
ports) needs to be configured and needs to be (internally) coordinated. That
part of the coordination and configuration HAS TO BE INTEGRATED into the
communication layer. Otherwise, you will end up having an overengineered piece
of crap.

Sylvain

[RFC] Redesign of the data flow interface

On Thu, 5 Mar 2009, Sylvain Joyeux wrote:

>>> * one-writer, multiple-readers model. Multiple ports can read one single
>>> write port, but there is no way for multiple write ports to write on
>>> the same read port.
>>
>> This is a "Coordination" concern, that should not be left to the
>> Communications implementation only. In my opinion, your second policy needs
>> an _activity_ that takes care of a (deterministic/guaranteed)
>> implementation of this policy; it also requires a non-trivial
>> Configuration. So, I think this (desirable!) extension should not be done
>> in just an communication object library only.
> You will have to rephrase this, because these sentences are way too obscure.

Yes, they require quite some "background knowledge" about the four
"concerns" of Coordination, Configuration, Computation and Communication
with which the design of complex software systems can be structured in a
nice conceptual way. More details can be found in the work of Radestock and
Eisenbach:
Matthias Radestock and Susan Eisenbach, "Coordination in Evolving Systems",
Trends in Distributed Systems: CORBA and Beyond, pp. 162-176, 1996.

I can send you an electronic copy offline, if you want.

>>> * no weird behaviour in case of multiple buffer connections that read the
>>> same port.
>>> * runtime connection policies. The parameters for each connection is set
>>> at runtime. A default policy can be specified at the read port level in
>>> order to simplify component usage (if you don't know what the component
>>> needs, then just use its default policy)
>>> * whatever the policy is, one can know if a sample has ever been written
>>> on it. I.e. no need to have a "default value" that means "no data"
>>> anymore. * proper connection management: if one side disconnects, then
>>> the other side knows that nobody is connected to them anymore.
>>
>> I don't like this direct coupling between both communicating partners...
> Well. A connection (i.e. a communication in Hermanspeak) is a two-partner
> thing. I don't see how one component can talk to the void (which it is doing
> right now). So, yes, when one stops talking to the other that HAS an effect on
> the other.

One of the "best practices" in making Communication work in complex and
dynamically evolving systems is to prevent components to talk to other
components directly (in the sense that they (have to) know each other), but
to use a "name server mediator" in between. This complexity is only needed
for the more dynamic and complex systems, not for most of the current robot
control systems.

BTW, CORBA has provisions for these more complex systems, but most current
users only deal with the simpler ones, and hence find these CORBA concepts
"bloat"... I tend to disagree, as you have undoubtedly already noticed, but
I do care about having the configuration flexibility in Orocos that
allows developers of "simple" systems to use only the relevant
concepts and primitives, without having to worry about the "bloat" required
in the more complex and dynamic systems. Finding a good framework design to
satisfy both needs (and a couple more needs in addition!) is tough, but I
think Orocos should try to make that effort...

>>> * a write ports can optionally "remember" what was the last sample
>>> written on them. If that is enabled, the policy allows to initialize
>>> new connections with the last value written on the port (if there is
>>> one).
>>
>> This, in my opinion, belongs to "Coordination", not in the Communication
>> implementation.
> Explain more. Why ?
Because every time you use the word "policy" it automatically means (at
least in the above-mentioned "4C" conceptual view on complex systems...)
that this feature should be provided by the Configuration (static solution,
with possibly direct knowledge of the communicating partners) and/or
Coordination (if dynamic aspects are involved, such as coordination of a
stateful communication protocol), and not by the Computation or
Communication parts...

>>> The bad
>>> --------------
>>> * CORBA not implemented. What I will have in the end is a CORBA
>>> implementation that is on-par with the C++ one: both pull or push
>>> models available, proper connection management, data signalling (the new
>>> "event ports" will work over CORBA)
>>
>> CORBA _implementations_ should remain _outside_ of the Orocos source tree!
>> Configuration _could_ be part of it, but then in a branch of the source
>> tree that can be configured out.
> Yes. Of course. And the CORBA implementation does not know the new data ports.

>>> * thread-safety is not properly taken care of. I'd love to have a proper
>>> R/W lock for connection/disconnection operations, but there is none in
>>> RTT itself. Would there be a problem if I was using the boost
>>> implementation ?
>>
>> Again, this belongs in Coordination. Is the boost implementation designed
>> to be used in a multi-component environment? (That is, does it take
>> Coordination into account?) Or is it "just" Communication level classes?
> Grmbl. You are mixing concept-level and implementation-level. Orocos is
> inherently a multi-threaded environment, so the C++ objects that make Orocos
> components need to be thread-safe. Period.

Not just "Period."! Also in the internal architecture of Orocos we'd rather
be sensitive to the "right" design patterns, and multi-threading is more
than just "thread safeness"... For example, one of the most difficult
things to get right is to get a multi-threaded application started, paused,
and stopped in a deterministic way; this _is_ Coordination, which goes
significantly beyond R/W locks... And Configuration is also useful to
consider, since it might well be possible that more than one
"multi-threading policy" should be provided.

>>> * no backward compatibility. That could be implemented easily though.
>>
>> If it could be done by only Configuration, that would be very nice...
> Not possible, as the current implementation hard-codes stuff that is now part
> of the policy environment (i.e. the configuration of the communication).
>
> Honestly, Hermann, I have the feeling that you really want with your concepts
> to make the whole implementation unmanageable.

On the contrary! By taking a lot of care that your _design_ is
as decoupled ("separation of concerns") as possible, the _implementation_
of the classes involved in the design will become easier. At least in the
medium and long term, while not necessarily so in the short (refactoring)
term. I find it my duty to keep on warning about this trade-off all the time
:-)

> Communication between components need to be Configured (what I call
> communication policies) and also need to be Coordinated (to allow
> modifications by different components). Fine.
Fine. But Coordination could also be involved in the runtime handling of
stateful communication protocols. I don't know whether we (want to) have
some of those in the near future... The only thing I currently see coming
close to this is the Command: this is really an asynchronous communication
primitive that typically involves 'state' information to be shared by both
"communicating" components...

> Now, what I would agree to is the following. To unify, in the code, the
> classes in terms of Coordination and Configuration. For instance, stop calling
> the communication policy ConnPolicy but call it ConnConfiguration (or
> whatever).

Having a uniform naming policy would indeed be great! Other developers
should jump in at this point, and identify other renaming suggestions that
can make all of Orocos semantically more consistent...

> what I will NEVER agree to is to separate everything just so that it maps
> your mindset. The communication layer (which, in this case, is limited to
> data ports), needs to be configured and needs to be (internally)
> coordinated. That part of the coordination and configuration HAS TO BE
> INTEGRATED into the communication layer. Otherwise, you will end up
> having an overengineered piece of crap.

That's up to you to prove then... :-) I would also suggest not to get
excited and become personal and start calling things "crap". Please stick
to rational arguments. I make my suggestions not because I want everybody
to agree with my "mindset", but because I am convinced by a number of
arguments that it _is_ worthwhile to take a much broader view at this
moment in the lifetime of Orocos, because it will prevent us from making
coupling mistakes that will be difficult to undo, and that make
understanding of the Orocos primitives more difficult.

And to pick up on what you say: there is no inherent conflict between
decoupling and integration! The key is the difference between "integration
in" and "integration with"; the latter separates the "modules to be
integrated" to the largest extent possible, allowing for more flexibility
and conceptual clarity. There is also no inherent conflict between
decoupling and integration at the implementation level, although I know
that implementations will be simpler (in the short term) if the developer
doesn't have to worry about the (much) larger picture... Please have a look
at most of the Microsoft "crap" to notice that they _deliberately_ pursue
integration _without_ full decoupling, with the sole reason of making
interoperability difficult. I am referring to software such as Sharepoint,
Exchange, and SqlServer (and more and more also .NET) that can only do
their integrated work together without allowing third-party replacement of
any of these components; much of this "monolithic coupling" has to do with
the lack of separation between communication, configuration and
coordination. The result is often a significant overhead at runtime (via
the extra efforts of the sysadmins that have to keep such a tightly
integrated beast operational...) and certainly a reduction in the options
and flexibility of end users. Orocos could/should do a better job at
integration, I hope.

Herman

[RFC] Redesign of the data flow interface

> >>> * one-writer, multiple-readers model. Multiple ports can read one
> >>> single write port, but there is no way for multiple write ports to
> >>> write on the same read port.
> >>
> >> This is a "Coordination" concern, that should not be left to the
> >> Communications implementation only. In my opinion, your second policy
> >> needs an _activity_ that takes care of a (deterministic/guaranteed)
> >> implementation of this policy; it also requires a non-trivial
> >> Configuration. So, I think this (desirable!) extension should not be
> >> done in just a communication object library only.
> >
> > You will have to rephrase this, because these sentences are way too
> > obscure.
>
> Yes, they require quite some "background knowledge" about the four
> "concerns" of Coordination, Configuration, Computation and Communication
> with which the design of complex software systems can be structured in a
> nice conceptual way. More details can be found in the work of Radestock and
Again, not clear to me. Please define "they".

> >> I don't like this direct coupling between both communicating partners...
> >
> > Well. A connection (i.e. a communication in Hermanspeak) is a two-partner
> > thing. I don't see how one component can talk to the void (which it is
> > doing right now). So, yes, when one stops talking to the other that HAS
> > an effect on the other.
>
> One of the "best practices" in making Communication work in complex and
> dynamically evolving systems is to prevent components from talking to other
> components directly (in the sense that they (have to) know each other), but
> to use a "name server mediator" in between. This complexity is only needed
> for the more dynamic and complex systems, not for most of the current robot
> control systems.
Nonetheless. You *need* at the component level to know whether there *is*
someone listening to you or writing data to you. The name server mediator you
refer to does not change the fact that *you are talking to someone*. What you
don't know is *who you are talking to*. Now, taking that in reverse, what if
the agent you are talking to decides not to talk to you anymore? Shouldn't
you know that? Communication, by definition, needs at least two agents.
Talking about communication layers that don't know there is "someone" at the
other side sounds really strange to me.
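
To make this concrete (a rough sketch only; the port and method names here
are illustrative, not necessarily what the branch will end up with):

    // Inside the updateHook() of a producing component: the component does
    // not know *who* is connected, only *whether* anybody is.
    if (position_out.connected())      // is there someone at the other end ?
        position_out.write(computePosition());
    else
        ; // e.g. skip the expensive computation, or raise a diagnostic

The component never manipulates the connection itself, it only queries its
own port.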

> BTW, CORBA has provisions for these more complex systems, but most current
> users only deal with the simpler ones, and hence find these CORBA concepts
> "bloat"... I tend to disagree, as you have undoubtedly already notices, but
> I do care about having the configuration flexibility into Orocos that
> allows developers of "simple" systems to have to use only the relevant
> concepts and primitives, without having to worry about the "bloat" required
> in the more complex and dynamic systems. Finding a good framework design to
> satisfy both needs (and a couple more needs in addition!) is tough, but I
> think Orocos should try to do that effort...
Well. In the case of robotic systems, I believe that you need supervision
layers that take care of the system-level coordination/configuration,
especially because (for instance in the multi-robot case) there is decision-
making involved: having to coordinate two robots together constrains both
robots and therefore should be done under the control of an (ideally
distributed) supervision system.

The indirect part of the CORBA standard was designed to allow functionality to
be moved transparently from one component to another. In the case of robotic
systems, I believe that the same goal should be achieved by reconfiguration of
the data flow (or message flow). Why? Because there is the need to have
"something" deciding on the switch. Then, that "thing" can also reconfigure
the data flow to take the change into account. The name service paradigm (and
name-based references, that is how it is actually implemented) is a more
complex solution that has no added benefit in my opinion.
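
As a sketch of what I mean (again, illustrative names only, not the final
API of the branch):

    // The supervision component decides to switch a filter from the real
    // sensor to the simulated one. The filter itself is never told who it
    // is connected to; it just keeps reading its input port.
    ConnPolicy policy = ConnPolicy::data();       // plain data connection
    filter_in.disconnect();                       // drop the old connection
    sim_sensor_out.connectTo(filter_in, policy);  // wire up the new producer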

> >>> * thread-safety is not properly taken care of. I'd love to have a
> >>> proper R/W lock for connection/disconnection operations, but there is
> >>> none in RTT itself. Would there be a problem if I was using the boost
> >>> implementation ?
> >>
> >> Again, this belongs in Coordination. Is the boost implementation
> >> designed to be used in a multi-component environment? (That is, does it
> >> take Coordination into account?) Or is it "just" Communication level
> >> classes?
> >
> > Grmbl. You are mixing concept-level and implementation-level. Orocos is
> > inherently a multi-threaded environment, so the C++ objects that make
> > Orocos components need to be thread-safe. Period.
>
> Not just "Period."! Also in the internal architecture of Orocos we'd rather
> be sensitive to the "right" design patterns, and multi-threading is more
> than just "thread safeness"...
Yes, it is more than thread-safety, but multi-threading IMO starts *at*
thread-safety.

> For example, one of the most difficult
> things to get right is to get a multi-threaded application started, paused,
> and stopped in a deterministic way; this _is_ Coordination, which goes
> significantly beyond R/W locks...
This is system-level coordination, yes. What I am talking about right here is
to get a data-flow implementation right. Meaning, a basic service that can be
used by system-level coordination to get the overall system running properly.
Of course, being able to specify various types of locking (or low-level
Coordination) is a feature. It is just that I have doubts about the usefulness of
it, and that this feature would come with a significant code complexity hit.
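
To illustrate the kind of low-level locking I have in mind (a sketch only;
whether a boost-based R/W lock is acceptable inside RTT, in particular with
respect to real-time constraints, is exactly the open question):

    #include <boost/thread/shared_mutex.hpp>
    #include <boost/thread/locks.hpp>
    #include <list>

    // Connection management guarded by a R/W lock: connect/disconnect are
    // rare and take the exclusive lock, while the per-sample path only takes
    // the shared lock, so data transfers do not serialize on each other.
    template <typename Connection>
    class ConnectionList
    {
        std::list<Connection>       connections;
        mutable boost::shared_mutex mtx;

    public:
        void add(Connection const& c)
        {
            boost::unique_lock<boost::shared_mutex> lock(mtx);  // writer
            connections.push_back(c);
        }

        void clear()
        {
            boost::unique_lock<boost::shared_mutex> lock(mtx);  // writer
            connections.clear();
        }

        template <typename Functor>
        void for_each(Functor f) const
        {
            boost::shared_lock<boost::shared_mutex> lock(mtx);  // reader
            for (typename std::list<Connection>::const_iterator it =
                     connections.begin(); it != connections.end(); ++it)
                f(*it);
        }
    };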

> And Configuration is also useful to
> consider, since it might well be possible that more than one
> "multi-threading policy" should be provided.
This would for instance be doable in the new implementation. But it would come
with a code complexity cost that is in my opinion not worth it.

> > what I will NEVER agree to is to separate everything just so that it maps
> > your mindset. The communication layer (which, in this case, is limited to
> > data ports), needs to be configured and needs to be (internally)
> > coordinated. That part of the coordination and configuration HAS TO BE
> > INTEGRATED into the communication layer. Otherwise, you will end up
> > having an overengineered piece of crap.
>
> That's up to you to prove then... :-)
I don't agree. Given that I am the one coding, that's up to you to change my
mind. That's the problem with concepts vs. implementation. The only way to
know (i.e. to be rational about it) is to actually implement things both ways
and compare. Not very practical. We're talking about beliefs here, and given
that I am the one coding, I'd rather trust my beliefs in the implementation
area (while reading your comments with interest of course).

> I would also suggest not to get excited and become personal and start
> calling things "crap". Please stick to rational arguments.
Well, I tend to call "crap" what I think will become crap. Orocos, as it
is implemented right now *is* overengineered. Fortunately, it is not yet a
piece of crap.

You make your suggestions based on your beliefs and I refer to mine. You
believe that everything (including the implementation) can be seen in terms of
the 4C and that improves understanding. I tend to agree. You want that mapping
to shape the implementation to the little details, I don't agree. While I do
agree that *in theory* that would lead to "something" that is very flexible, I
also am on the implementation side and *know* that seeking too much
flexibility leads to "overengineered pieces of crap".

In other words: proper engineering is about choosing on a case-by-case basis
whether one should stick to the concepts to clarify the situation and improve
the implementation, or whether one should "stray away" from the concepts.
Overengineering is "stick to the concepts whatever the associated
implementation costs".

[RFC] Redesign of the data flow interface

On Thu, 5 Mar 2009, Sylvain Joyeux wrote:

>>>>> * one-writer, multiple-readers model. Multiple ports can read one
>>>>> single write port, but there is no way for multiple write ports to
>>>>> write on the same read port.
>>>>
>>>> This is a "Coordination" concern, that should not be left to the
>>>> Communications implementation only. In my opinion, your second policy
>>>> needs an _activity_ that takes care of a (deterministic/guaranteed)
>>>> implementation of this policy; it also requires a non-trivial
>>>> Configuration. So, I think this (desirable!) extension should not be
>>>> done in just a communication object library only.
>>>
>>> You will have to rephrase this, because these sentences are way too
>>> obscure.
>>
>> Yes, they require quite some "background knowledge" about the four
>> "concerns" of Coordination, Configuration, Computation and Communication
>> with which the design of complex software systems can be structured in a
>> nice conceptual way. More details can be found in the work of Radestock and
> Again, not clear to me. Please define "they".

"they" = "these sentences".

>>>> I don't like this direct coupling between both communicating partners...
>>>
>>> Well. A connection (i.e. a communication in Hermanspeak) is a two-partner
>>> thing. I don't see how one component can talk to the void (which it is
>>> doing right now). So, yes, when one stops talking to the other that HAS
>>> an effect on the other.
>>
>> One of the "best practices" in making Communication work in complex and
>> dynamically evolving systems is to prevent components from talking to other
>> components directly (in the sense that they (have to) know each other), but
>> to use a "name server mediator" in between. This complexity is only needed
>> for the more dynamic and complex systems, not for most of the current robot
>> control systems.
> Nonetheless. You *need* at the component level to know whether there *is*
> someone listening to you or writing data to you.
The component has its DataPorts, and those should always be
readable/writable. Where this data is going to/coming from, and when, and
with which policy, is not a direct concern of the service-providing
component; it is the concern of the "coordinating TaskContext", which knows
about all the components it has in its system and about their couplings,
communication policies, and their coordination needs.

> The name server mediator you
> refer to does not change the fact that *you are talking to someone*. What you
> don't know is *who you are talking to*. Now, taking that in reverse, what if
> the agent you are talking to decides not to talk to you anymore?

Again, it's the _Coordinator_ component who is responsible for this
behaviour, not the individual components. And it should take appropriate
action, via events, via changes in the data port connections, via changes
in the communication policies, ...

> Shouldn't
> you know that? Communication, by definition, needs at least two agents.
> Talking about communication layers that don't know there is "someone" at the
> other side sounds really strange to me.
It is not at all strange to me! At least not in the context that any
intelligent system will _always_ need a Coordination component that knows
how to react appropriately. That is, how to re-configure the other
components' Computation, Communication and Coordination...

>> BTW, CORBA has provisions for these more complex systems, but most current
>> users only deal with the simpler ones, and hence find these CORBA concepts
>> "bloat"... I tend to disagree, as you have undoubtedly already notices, but
>> I do care about having the configuration flexibility into Orocos that
>> allows developers of "simple" systems to have to use only the relevant
>> concepts and primitives, without having to worry about the "bloat" required
>> in the more complex and dynamic systems. Finding a good framework design to
>> satisfy both needs (and a couple more needs in addition!) is tough, but I
>> think Orocos should try to do that effort...
> Well. In the case of robotic systems, I believe that you need supervision
> layers that take care of the system-level coordination/configuration,

Exactly. I call these "supervision layers" the Coordination.

> especially because (for instance in the multi-robot case) there is decision-
> making involved: having to coordinate two robots together constrains both
> robots and therefore should be done under the control of an (ideally
> distributed) supervision system.
As expected, we are really thinking along the same lines, but it takes some
time to sort this out, because of hidden assumptions and semantic
ambiguities :-)

> The indirect part of the CORBA standard was designed to allow functionality to
> be moved transparently from one component to another. In the case of robotic
> systems, I believe that the same goal should be achieved by reconfiguration of
> the data flow (or message flow). Why? Because there is the need to have
> "something" deciding on the switch. Then, that "thing" can also reconfigure
> the data flow to take the change into account.

This corresponds completely with my explanations above.

> The name service
> paradigm (and name-based references, that is how it is actually implemented)
> is a more complex solution that has no added benefit in my opinion.
I tend to look at it as an available _mechanism_ that the system builders
can decide to use or not, and if so, in what way. There is never an
obligation to use name serving.

>>>>> * thread-safety is not properly taken care of. I'd love to have a
>>>>> proper R/W lock for connection/disconnection operations, but there is
>>>>> none in RTT itself. Would there be a problem if I was using the boost
>>>>> implementation ?
>>>>
>>>> Again, this belongs in Coordination. Is the boost implementation
>>>> designed to be used in a multi-component environment? (That is, does it
>>>> take Coordination into account?) Or is it "just" Communication level
>>>> classes?
>>>
>>> Grmbl. You are mixing concept-level and implementation-level. Orocos is
>>> inherently a multi-threaded environment, so the C++ objects that make
>>> Orocos components need to be thread-safe. Period.
>>
>> Not just "Period."! Also in the internal architecture of Orocos we'd rather
>> be sensitive to the "right" design patterns, and multi-threading is more
>> than just "thread safeness"...
> Yes, it is more than thread-safety, but multi-threading IMO starts *at*
> thread-safety.
Yes, as far as the Computation is concerned. I almost always add the
configuration and coordination aspects to the picture by default... :-) And
it really helps to make things conceptually clearer. And I believe that
conceptual clarity is a good precondition for high quality
implementations...

>> For example, one of the most difficult
>> things to get right is to get a multi-threaded application started, paused,
>> and stopped in a deterministic way; this _is_ Coordination, which goes
>> significantly beyond R/W locks...
> This is system-level coordination, yes. What I am talking about right here is
> to get a data-flow implementation right. Meaning, a basic service that can be
> used by system-level coordination to get the overall system running properly.

We fully agree! :-)

> Of course, being able to specify various types of locking (or low-level
> Coordination) is a feature. It is just that I have doubts about the usefulness of
> it, and that this feature would come with a significant code complexity hit.
That is for others to decide and to implement. Coordination is anyway not a
dependency at this moment. We should just make sure that no
coordination-like features are hard-baked into the current implementations
(which are mostly dealing with Computation, and some Communication).

>> And Configuration is also useful to
>> consider, since it might well be possible that more than one
>> "multi-threading policy" should be provided.
> This would for instance be doable in the new implementation. But it would come
> with a code complexity cost that is in my opinion not worth it.

Let other people decide about that, if it's not within your current
interest. Just make sure you don't prevent others from doing it :-)

>>> what I will NEVER agree to is to separate everything just so that it maps
>>> your mindset. The communication layer (which, in this case, is limited to
>>> data ports), needs to be configured and needs to be (internally)
>>> coordinated. That part of the coordination and configuration HAS TO BE
>>> INTEGRATED into the communication layer. Otherwise, you will end up
>>> having an overengineered piece of crap.
>>
>> That's up to you to prove then... :-)
> I don't agree. Given that I am the one coding, that's up to you to change my
> mind. That's the problem with concepts vs. implementation. The only way to
> know (i.e. to be rational about it) is to actually implement things both ways
> and compare.

I am not at all impressed by this kind of claim... Extrapolated to real
life, that would mean that the best solutions and designs can only come up via
implementation, and that rational thinking is worthless. That's not really
a compliment to 500 years of post-Renaissance evolution in the Western
world :-) It would also mean that companies would always provide better
solutions than universities...

> Not very practical. We're talking about beliefs here, and given
> that I am the one coding, I'd rather trust my beliefs in the implementation
> area (while reading your comments with interest of course).

I understand :-)

>> I would also suggest not to get excited and become personal and start
>> calling things "crap". Please stick to rational arguments.
> Well, I tend to call "crap" what I think will become crap. Orocos, as it
> is implemented right now *is* overengineered. Fortunately, it is not yet a
> piece of crap.
>
> You make your suggestions based on your beliefs and I refer to mine. You
> believe that everything (including the implementation) can be seen in terms of
> the 4C and that improves understanding. I tend to agree. You want that mapping
> to shape the implementation to the little details, I don't agree. While I do
> agree that *in theory* that would lead to "something" that is very flexible, I
> also am on the implementation side and *know* that seeking too much
> flexibility leads to "overengineered pieces of crap".

That depends on what your scope is... I want Orocos to be useful eventually
in _all_ possible robotic systems. Before we are there, a lot of
"overengineering" is still to be done :-) But I certainly agree that some
current implementations can be refactored to simplify them.

> In other words: proper engineering is about choosing on a case-by-case basis
> whether one should stick to the concepts to clarify the situation and improve
> the implementation, or whether one should "stray away" from the concepts.
> Overengineering is "stick to the concepts whatever the associated
> implementation costs".

That depends on what your scope is... :-)

Herman