Towards a UML 2.0 compliant state machine implementation

Dear List,

Please find my suggestions and thoughts regarding the OROCOS state
machine redesign below. Apologies for the long email, but the topic
has become quite involved :-)

Summary:
--------

We want UML 2.0 style state machines for various reasons. TimeEvents,
real composite states and AND-states are powerful mechanisms;
furthermore we are now starting to work on integrating with various
model driven engineering tools, so being (reasonably) in line with the
major standards will make this much easier.

These are the major questions:

- how to integrate a state machine into an OROCOS component
- how to deal with UML 2.0 variation points in a flexible way
- how to go about scripting

Integration into OROCOS components
-----------------------------------

First of all (as discussed before) a state machine will consist of
two parts: a synchronous core and an asynchronous layer around it:

1) The core layer supports a minimal interface for controlling a state
machine: starting, stopping, resetting, querying status, queuing an
event and requesting a step. The step call will only return once a
run-to-completion step (or the maximum allowed number of supersteps)
has been executed.

2) The asynchronous layer can be seen as the glue which ties a state
machine to an OROCOS component. It defines which events are delivered
to a state machine queue and when and how the state machine is
advanced.
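To make the split concrete, here is a rough C++ sketch of what the minimal synchronous core interface could look like. All type and member names are illustrative, not an existing RTT API, and transition dispatch is elided:

```cpp
#include <deque>
#include <string>

// Hypothetical sketch of the synchronous core; names are made up for
// illustration. The asynchronous layer would decide when to call step().
enum class SMStatus { Inactive, Active };

struct Event { std::string name; };

class StateMachineCore {
public:
    void start()  { status_ = SMStatus::Active; }
    void stop()   { status_ = SMStatus::Inactive; }
    void reset()  { queue_.clear(); status_ = SMStatus::Inactive; }
    SMStatus status() const { return status_; }

    // Queue an event; no processing happens here.
    void queueEvent(Event e) { queue_.push_back(std::move(e)); }

    // Execute one run-to-completion step; returns the number of events
    // consumed (0 if inactive or the queue is empty).
    int step() {
        if (status_ != SMStatus::Active || queue_.empty()) return 0;
        Event e = queue_.front();
        queue_.pop_front();
        (void)e;  // transition dispatch elided in this sketch
        return 1;
    }

private:
    SMStatus status_ = SMStatus::Inactive;
    std::deque<Event> queue_;  // FIFO by default
};
```

The point of the sketch is only the shape of the interface: the core never blocks waiting for events, it merely consumes what the surrounding layer has queued.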

I identified three major use cases:

1) synchronous tracking: a state machine is used for tracking the
state of a component and is advanced on each CallEvent received. The
state machine needs to be stepped after each method invocation.

2) synchronous constraining: a state machine is stepped before the
requested operation is executed and is in control of actually allowing
or denying it. This is the typical use case of a protocol state
machine.

3) asynchronous: a state machine never gives up control, reacts
immediately to events and therefore requires its own thread of
execution.

1) requires the following to happen in a TaskContext:
- generate a (local) CallEvent when a Method is invoked and store it
into the state machine event queue.
- step the state machine.

A state machine will probably not be interested in _all_ CallEvents,
so queuing and stepping at each CallEvent would create needless
overhead. A mechanism which allows a state machine to subscribe to the
events it is interested in would solve this.

2) this is a special case of subscribing to a CallEvent, one which
causes the actual invocation of the requested operation to be
omitted. The state machine then determines whether or not this
operation is executed and executes it itself.

3) run the state machine in a TaskContext with a NonPeriodicActivity
attached
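As a rough illustration of the "synchronous constraining" mechanism (2), the sketch below shows a hypothetical protocol state machine that is consulted instead of the operation itself and decides whether to run it. All names, states and operations are made up for the example:

```cpp
#include <functional>
#include <map>
#include <set>
#include <string>

// Illustrative protocol state machine: per state, a set of allowed
// operations. The component forwards every requested operation here;
// the SM either denies it or executes it itself.
class ProtocolSM {
public:
    ProtocolSM() {
        allowed_["Stopped"] = {"configure", "start"};
        allowed_["Running"] = {"stop", "update"};
    }

    // Called instead of the operation itself; returns whether it ran.
    bool request(const std::string& op, const std::function<void()>& body) {
        if (!allowed_[state_].count(op)) return false;  // deny
        body();                                         // allow: SM runs it
        if (op == "start") state_ = "Running";          // protocol transition
        if (op == "stop")  state_ = "Stopped";
        return true;
    }

    const std::string& state() const { return state_; }

private:
    std::string state_ = "Stopped";
    std::map<std::string, std::set<std::string>> allowed_;
};
```

This shows the trade-off Peter later points out: the operation body lives in one place, the protocol that gates it in another.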

UML 2.0 variation points
-------------------------

There is really only one UML 2.0 variation point which is tricky,
namely the ChangeEvent, which triggers a transition when a condition
specified by an expression evaluates to true. Peter has expressed his
concern about whether such an event can be implemented efficiently. I
agree; the requirements (read: the frequencies) for evaluating this
condition will simply be too conflicting.

The proposed default therefore is to evaluate this condition before
the do behavior is executed. The rationale behind this is to allow the
ChangeEvent to trigger a transition before the do behavior is executed
(and thereby skipping its execution) while at the same time creating
as little overhead as possible.

For conditions that require more frequent evaluation, this event
type must be realized in terms of an individual component, to which a
periodic activity with the desired frequency and priority can be
attached.
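A minimal sketch of the proposed default, with hypothetical names: each pending ChangeEvent condition is evaluated exactly once, right before the do behavior runs, and a firing condition takes its transition and skips the do behavior:

```cpp
#include <functional>
#include <vector>

// Illustrative only: a ChangeEvent pairs a condition expression with
// the transition to take when it evaluates to true.
struct ChangeEvent {
    std::function<bool()> condition;
    std::function<void()> transition;
};

// Returns true if a transition was taken (and the do behavior skipped).
bool runState(const std::vector<ChangeEvent>& changes,
              const std::function<void()>& doBehavior) {
    for (const auto& ce : changes) {
        if (ce.condition()) {  // evaluated only here, not continuously
            ce.transition();
            return true;       // transition taken, do behavior skipped
        }
    }
    doBehavior();
    return false;
}
```

The consequence, discussed below in the thread, is that the evaluation frequency is tied to how often the state is executed, not to the condition itself.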

Most other variation points are rather trivial:

- Order of event dequeuing: default to FIFO.

- Allow ChangeEvents to persist even after they evaluate to false:
default to true. (We don't start searching and removing events from
the queue - double checks etc. can be performed in the next state)

Supersteps
----------

"Superstep" refers to the behavior of a state machine which
transitions across multiple states after being triggered externally
once. This can happen e.g. if there are multiple events in a queue.

There are good reasons to allow or forbid this behavior in an
application. For more deterministic execution it would certainly be
better to forbid supersteps. But in a different scenario it might be
desirable to process a full event queue and thereby "catch up" on the
work to be done. It is therefore proposed to forbid supersteps by
default, but to allow overriding this with a maximum allowed number of
supersteps.
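The proposal amounts to a step call with a configurable superstep limit, defaulting to one. A self-contained sketch with illustrative names:

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Sketch: the default limit of 1 forbids supersteps; a larger limit
// lets the machine "catch up" on a full queue in one external trigger.
class SteppedSM {
public:
    void queueEvent(std::string e) { queue_.push_back(std::move(e)); }

    // Process at most maxSupersteps run-to-completion steps; returns
    // how many were actually taken.
    int step(int maxSupersteps = 1) {
        int taken = 0;
        while (taken < maxSupersteps && !queue_.empty()) {
            queue_.pop_front();  // transition dispatch elided
            ++taken;
        }
        return taken;
    }

    std::size_t pending() const { return queue_.size(); }

private:
    std::deque<std::string> queue_;
};
```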

Scripting
---------

As explained in previous mails, a state machine consists mostly of
scripts (in the sense of statements interpreted at run time).
Therefore I've reached the conclusion that it would be very beneficial
to implement the underlying state machine logic itself in the
scripting language. The main advantage of this would be a high level
of flexibility for overriding and configuring semantic variation
points. The main drawback is that a light-weight, hard real-time
scripting language expressive enough to make this feasible does not
currently exist. The problem is of course due to the inherent
difficulty of hard real-time garbage collection.

As lightweight, hard real-time scripting languages do not exist, an
intermediate solution is required. One approach is the use of a soft
real-time scripting language. Herman suggested some time ago to take a
look at Lua [1], which I did. Lua is a mature, lightweight, MIT
licensed programming language especially targeted at being embedded. A
key feature of Lua is the recently added soft real-time incremental
garbage collector that allows fine-grained control of the collection
process. Tuning its parameters makes it possible to balance temporal
behavior against overall performance.

Adopting Lua as the next generation scripting language for OROCOS
would have several benefits. First of all, it provides a rather
non-intrusive migration path towards a more powerful extension
language. Secondly, state machines could initially be implemented in
Lua, profiled (Lua has support for this) and then optimized by moving
parts of the implementation to C/C++. The foreign function interface
for this is very efficient and simple to use.

Of course the downside of this approach would be an (initial) decrease
in hard real-time capabilities. However, I believe this would be
acceptable because:

- an application that has really critical real-time requirements will
implement this particular part in C/C++ anyway.

- The amount of such critical code is relatively small compared to
the amount of code for which slightly softer real-time constraints
would suffice.

- There is more optimization possible for periodically executing
scripts and state machines. Adaptive garbage collection algorithms
could be developed that learn to schedule garbage collection precisely
within the idle period.

That's it!

I'm sure there are many more issues yet to be discovered, but I think
it's time to cut loose. Am I given the green light?

Best regards
Markus

[1] http://www.lua.org/

Towards a UML 2.0 compliant state machine implementation

On Tuesday 24 February 2009 12:31:34 Markus Klotzbücher wrote:
...
> Integration into OROCOS components
> -----------------------------------
>
> First of all (as discussed before) a state machine will consist of
> two parts: a synchronous core and an asynchronous layer around it:
>
> 1) The core layer supports a minimal interface for controlling a state
> machine: starting, stopping, resetting, querying status, queuing an
> event and requesting a step. The step call will only return once a
> run-to-completion step (or the maximum allowed number of supersteps)
> has been executed.

I can imagine something with that... Except queueing an event maybe. Orocos
events are callbacks, so the only thing one could queue is a function object,
which is already done by the EventProcessor (or MessageProcessor in RTT 2.0).

>
> 2) The asynchronous layer can be seen as the glue which ties a state
> machine to an OROCOS component. It defines which events are delivered
> to a state machine queue and when and how the state machine is
> advanced.

Does this mean this part is fully user specified ? And is this fully in a
script or partly in C++?

>
> I identified three major use cases:
>
> 1) synchronous tracking: a state machine is used for tracking the
> state of a component and is advanced on each CallEvent received. The
> state machine needs to be stepped after each method invocation.

With all due respect, but this isn't a use case. It's a mechanism. In which
application would a user need this? What would the (pseudo-)code look like?

>
> 2) synchronous constraining: a state machine is stepped before the
> requested operation is executed and is in control of actually allowing
> or denying it. This is the typical use case of a protocol state
> machine.

Ok. So this allows tying preconditions to a method invocation. Basically, you
split the code into two places, but you win that you have one place where you
specify the 'protocol'.

>
> 3) asynchronous: a state machine never gives up control, reacts
> immediately to events and therefore requires its own thread of
> execution.

This 'therefore' comes too fast for me.

>
>
> 1) requires the following to happen in a TaskContext:
> - generate a (local) CallEvent when a Method is invoked and store it
> into the state machine event queue.
> - step the state machine.
>
> A state machine will probably not be interested in _all_ CallEvents,
> so queuing and stepping at each CallEvent would create needless
> overhead. A mechanism which allows a state machine to subscribe to the
> events it is interested in would solve this.

In my opinion, this is already the case: plain RTT::Methods do not generate
any 'CallEvents'. They are very fast and light-weight. In case your SM wants
to subscribe to a call event, you add an RTT::Event to the interface, which
replaces the RTT::Method. The event has the same signature and name. In RTT
2.0, this would even happen transparently, because adding an event to the
'call' interface would turn it into an RTT::Method or RTT::Message.

>
> 2) this is a special case of subscribing to a CallEvent, one which
> causes the actual invocation of the requested operation to be
> omitted. The state machine then determines whether or not this
> operation is executed and executes it itself.

Would this imply that there is an 'internal' name for the operation (called by
the SM) and a public name (called by the client)?

>
> 3) run the state machine in a TaskContext with a NonPeriodicActivity
> attached

I don't see how this is excluded by (or orthogonal to) points 1. and 2.

>
>
> UML 2.0 variation points
> -------------------------
>
> There is really only one UML 2.0 variation point which is tricky,
> namely the ChangeEvent, which triggers a transition when a condition
> specified by an expression evaluates to true. Peter has expressed his
> concern about whether such an event can be implemented efficiently. I
> agree; the requirements (read: the frequencies) for evaluating this
> condition will simply be too conflicting.
>
> The proposed default therefore is to evaluate this condition before
> the do behavior is executed. The rationale behind this is to allow the
> ChangeEvent to trigger a transition before the do behavior is executed
> (and thereby skipping its execution) while at the same time creating
> as little overhead as possible.

That makes sense, but at which frequency is the do action executed ? Dependent
on the frequency of incoming events ?

>
> For conditions that require more frequent evaluation, this event
> type must be realized in terms of an individual component, to which a
> periodic activity with the desired frequency and priority can be
> attached.

Equal to today's workaround.

>
> Most other variation points are rather trivial:
>
> - Order of event dequeuing: default to FIFO.

In your dreams :-) A 'default' FIFO is fragile when a client changes its mind
rapidly and the SM is unable to follow in real-time. I don't have a clear
solution to this case however.

>
> - Allow ChangeEvents to persist even after they evaluate to false:
> default to true. (We don't start searching and removing events from
> the queue - double checks etc. can be performed in the next state)

We wouldn't be able to do that today either (with RTT::Event).

>
>
>
> Supersteps
> ----------
>
> "Superstep" refers to the behavior of a state machine which
> transitions across multiple states after being triggered externally
> once. This can happen e.g. if there are multiple events in a queue.
>
> There are good reasons to allow or forbid this behavior in an
> application. For more deterministic execution it would certainly be
> better to forbid supersteps. But in a different scenario it might be
> desirable to process a full event queue and thereby "catch up" on the
> work to be done. It is therefore proposed to forbid supersteps by
> default, but to allow overriding this with a maximum allowed number of
> supersteps.

Supersteps only matter in sequential execution ('single threaded') cases where
the queueing of the event causes the transition to be made in the same thread.
Using today's SequentialActivity, all supersteps are taken until they run out
of events. This means that if you emit an event to which you react, you go
into an infinite loop. We can't do otherwise in this case. Exactly the same
would happen if you emit an RTT::Event in one of its event handler functions.
The only solution is to fix your algorithm or to decouple with a thread.
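To illustrate the failure mode, here is a self-contained sketch (illustrative names, no RTT) of sequential event processing where a handler re-emits the event it reacts to. The step budget exists only so the demonstration terminates; a real sequential executor would loop forever:

```cpp
#include <deque>
#include <functional>
#include <map>
#include <string>

// Pops and handles events in the same thread until the queue is empty
// or the budget runs out; returns the number of steps taken. A handler
// that re-emits its own event keeps the queue non-empty indefinitely.
int processSequentially(std::deque<std::string>& queue,
                        std::map<std::string, std::function<void()>>& handlers,
                        int budget) {
    int steps = 0;
    while (!queue.empty() && steps < budget) {
        std::string e = queue.front();
        queue.pop_front();
        handlers[e]();  // may emit again -> never runs out of events
        ++steps;
    }
    return steps;
}
```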

>
>
> Scripting
> ---------
...
> As lightweight, hard real-time scripting languages do not exist, an
> intermediate solution is required. One approach is the use of a soft
> real-time scripting language. Herman suggested some time ago to take a
> look at Lua [1], which I did. Lua is a mature, lightweight, MIT
> licensed programming language especially targeted at being embedded. A
> key feature of Lua is the recently added soft real-time incremental
> garbage collector that allows fine-grained control of the collection
> process. Tuning its parameters makes it possible to balance temporal
> behavior against overall performance.
...

If Lua's garbage collector can be tied up with the RT-malloc implementation,
I'd be very much interested in having it. But I'd really like to see how one
would write down a simple Orocos state machine in Lua.

Peter

Towards a UML 2.0 compliant state machine implementation

On Mon, Mar 09, 2009 at 05:02:01PM +0100, Peter Soetens wrote:
> On Tuesday 24 February 2009 12:31:34 Markus Klotzbücher wrote:
> ...
> > Integration into OROCOS components
> > -----------------------------------
> >
> > First of all (as discussed before) a state machine will consist of
> > two parts: a synchronous core and an asynchronous layer around it:
> >
> > 1) The core layer supports a minimal interface for controlling a state
> > machine: starting, stopping, resetting, querying status, queuing an
> > event and requesting a step. The step call will only return once a
> > run-to-completion step (or the maximum allowed number of supersteps)
> > has been executed.
>
> I can imagine something with that... Except queueing an event maybe. Orocos
> events are callbacks, so the only thing one could queue is a function object,
> which is already done by the EventProcessor (or MessageProcessor in RTT 2.0).

Hmm, I guess I'd better have written UML::Event then :-)

> > 2) The asynchronous layer can be seen as the glue which ties a state
> > machine to an OROCOS component. It defines which events are delivered
> > to a state machine queue and when and how the state machine is
> > advanced.
>
> Does this mean this part is fully user specified ? And is this fully in a

No, not user specified. The interface should be as light as possible,
maybe only init() and exec_sm(file f).

> script or partly in C++?

The glue will certainly be at least partly in C/C++.

> > I identified three major use cases:
> >
> > 1) synchronous tracking: a state machine is used for tracking the
> > state of a component and is advanced on each CallEvent received. The
> > state machine needs to be stepped after each method invocation.
>
> With all due respect, but this isn't a use case. It's a mechanism. In which
> application would a user need this? What would the (pseudo-)code look like?

You're right. See below.

> >
> > 2) synchronous constraining: a state machine is stepped before the
> > requested operation is executed and is in control of actually allowing
> > or denying it. This is the typical use case of a protocol state
> > machine.
>
> Ok. So this allows tying preconditions to a method invocation. Basically, you
> split the code into two places, but you win that you have one place where you
> specify the 'protocol'.
>
> >
> > 3) asynchronous: a state machine never gives up control, reacts
> > immediately to events and therefore requires its own thread of
> > execution.
>
> This 'therefore' comes too fast for me.

Never giving up control implies requiring its own thread. Or: never
giving up control implies not allowing anybody else to run.

However I agree with your critique of the "use cases". My current
point of view is that if we turn this

http://people.mech.kuleuven.be/~orocos/pub/stable/documentation/rtt/curr...

into a (meta-) state machine and move towards UML style event handling
(which is currently done by various unrelated OROCOS entities), we can
achieve what I describe above and drastically simplify matters.

> > 1) requires the following to happen in a TaskContext:
> > - generate a (local) CallEvent when a Method is invoked and store it
> > into the state machine event queue.
> > - step the state machine.
> >
> > A state machine will probably not be interested in _all_ CallEvents,
> > so queuing and stepping at each CallEvent would create needless
> > overhead. A mechanism which allows a state machine to subscribe to the
> > events it is interested in would solve this.
>
> In my opinion, this is already the case: plain RTT::Methods do not generate
> any 'CallEvents'. They are very fast and light-weight. In case your SM wants
> to subscribe to a call event, you add an RTT::Event to the interface, which
> replaces the RTT::Method. The event has the same signature and name. In RTT
> 2.0, this would even happen transparently, because adding an event to the
> 'call' interface would turn it into an RTT::Method or RTT::Message.

Agreed, that's one way it could be done now. But both my
"subscription" model and what you describe require extra work. With
"unified events" you get this for free (you just take from the queue
what you want).

> > 2) this is a special case of subscribing to a CallEvent, one which
> > causes the actual invocation of the requested operation to be
> > omitted. The state machine then determines whether or not this
> > operation is executed and executes it itself.
>
> Would this imply that there is an 'internal' name for the operation (called by
> the SM) and a public name (called by the client)?
>
> >
> > 3) run the state machine in a TaskContext with a NonPeriodicActivity
> > attached
>
> I don't see how this is excluded by (or orthogonal to) points 1. and 2.

It's not. These thoughts are too old, please ignore.

> > UML 2.0 variation points
> > -------------------------
> >
> > There is really only one UML 2.0 variation point which is tricky,
> > namely the ChangeEvent, which triggers a transition when a condition
> > specified by an expression evaluates to true. Peter has expressed his
> > concern about whether such an event can be implemented efficiently. I
> > agree; the requirements (read: the frequencies) for evaluating this
> > condition will simply be too conflicting.
> >
> > The proposed default therefore is to evaluate this condition before
> > the do behavior is executed. The rationale behind this is to allow the
> > ChangeEvent to trigger a transition before the do behavior is executed
> > (and thereby skipping its execution) while at the same time creating
> > as little overhead as possible.
>
> That makes sense, but at which frequency is the do action executed ? Dependent
> on the frequency of incoming events ?

Exactly!!!

> > For conditions that require more frequent evaluation, this event
> > type must be realized in terms of an individual component, to which a
> > periodic activity with the desired frequency and priority can be
> > attached.
>
> Equal to today's workaround.

Yes, as you say, all good ideas are stolen ;-)

> > Most other variation points are rather trivial:
> >
> > - Order of event dequeuing: default to FIFO.
>
> In your dreams :-) A 'default' FIFO is fragile when a client changes its mind
> rapidly and the SM is unable to follow in real-time. I don't have a clear
> solution to this case however.

Yeah, I mean of course you need to limit the maximum capacity. I'm not
mentioning error handling here.

> > - Allow ChangeEvents to persist even after they evaluate to false:
> > default to true. (We don't start searching and removing events from
> > the queue - double checks etc. can be performed in the next state)
>
> We wouldn't be able to do that today either (with RTT::Event).

It would become easy with my suggestion above.

> > Supersteps
> > ----------
> >
> > "Superstep" refers to the behavior of a state machine which
> > transitions across multiple states after being triggered externally
> > once. This can happen e.g. if there are multiple events in a queue.
> >
> > There are good reasons to allow or forbid this behavior in an
> > application. For more deterministic execution it would certainly be
> > better to forbid supersteps. But in a different scenario it might be
> > desirable to process a full event queue and thereby "catch up" on the
> > work to be done. It is therefore proposed to forbid supersteps by
> > default, but to allow overriding this with a maximum allowed number of
> > supersteps.
>
> Supersteps only matter in sequential execution ('single threaded') cases where
> the queueing of the event causes the transition to be made in the same thread.
> Using today's SequentialActivity, all supersteps are taken until they run out
> of events. This means that if you emit an event to which you react, you go
> into an infinite loop. We can't do otherwise in this case. Exactly the same
> would happen if you emit an RTT::Event in one of its event handler functions.
> The only solution is to fix your algorithm or to decouple with a thread.

I think creating an infinite loop is the extreme case and likely an
obvious error. But it's a valid question whether to allow supersteps
at all. I tend more and more to say no; if event queues overrun, it is
a design error.

BTW: SequentialActivity is not mentioned anywhere in the docs, is it?

> > Scripting
> > ---------
> ...
> > As lightweight, hard real-time scripting languages do not exist, an
> > intermediate solution is required. One approach is the use of a soft
> > real-time scripting language. Herman suggested some time ago to take a
> > look at Lua [1], which I did. Lua is a mature, lightweight, MIT
> > licensed programming language especially targeted at being embedded. A
> > key feature of Lua is the recently added soft real-time incremental
> > garbage collector that allows fine-grained control of the collection
> > process. Tuning its parameters makes it possible to balance temporal
> > behavior against overall performance.
> ...
>
> If Lua's garbage collector can be tied up with the RT-malloc implementation,

It can.

> I'd be very much interested in having it. But I'd really like to see how one
> would write down a simple Orocos state machine in Lua.

I believe that's the easy part. The hard part is to integrate with
non-trivial C++.

Best regards
Markus

Towards a UML 2.0 compliant state machine implementation

On Tue, 24 Feb 2009, Markus Klotzbücher wrote:

> Please find my suggestions and thoughts regarding the OROCOS state
> machine redesign below. Apologies for the long email, but the topic
> has become quite involved :-)

Indeed. I recently read something about the Computation, Communication,
Configuration and Coordination concerns for complex software systems, and I
think this is the conceptual framework we are missing in the Orocos design to
make our ideas clear, and to provide the appropriate decoupling!
(It will also make Klaas happy, because his favourite "models of
computation" concept fits naturally within this 4C...)
I am referring to
@InCollection{ RadestockEisenbach1996,
author = {Radestock, Matthias and Eisenbach, Susan},
title = {Coordination in evolving systems},
booktitle = {Trends in Distributed Systems: CORBA and Beyond},
pages = {162--176},
year = {1996}
}

So, in my answers and remarks below, I will refer to one or more of these
4Cs quite often.

> Summary:
> --------
>
> We want UML 2.0 style state machines for various reasons. TimeEvents,
> real composite states and AND-states are powerful mechanisms;
> furthermore we are now starting to work on integrating with various
> model driven engineering tools, so being (reasonably) in line with the
> major standards will make this much easier.
>
> These are the major questions:
>
> - how to integrate a state machine into an OROCOS component
> - how to deal with UML 2.0 variation points in a flexible way
> - how to go about scripting
>
>
> Integration into OROCOS components
> -----------------------------------
>
> First of all (as discussed before) a state machine will consist of
> two parts: a synchronous core and an asynchronous layer around it:
>
> 1) The core layer supports a minimal interface for controlling a state
> machine: starting, stopping, resetting, querying status, queuing an
> event and requesting a step. The step call will only return once a
> run-to-completion step (or the maximum allowed number of supersteps)
> has been executed.

The concept of a "step" is not clear, in my opinion. It does not belong to
the traditional (and UML standardized) FSM concepts, so it should not be in
the core!

> 2) The asynchronous layer can be seen as the glue which ties a state
> machine to an OROCOS component. It defines which events are delivered
> to a state machine queue and when and how the state machine is
> advanced.

Similar remark: "asynchronous layer" is not a well-defined concept.

> I identified three major use cases:
>
> 1) synchronous tracking: a state machine is to be used for tracking
> the state of a component and is advanced on each CallEvent
> received.
Ok.

> The state machine needs to be stepped after each method
> invocation.
Needs clarification, because of the "step"-feature.

> 2) synchronous constraining: a state machine is stepped before the
> requested operation is executed and is in control of actually allowing
> or denying it. This is the typical use case of a protocol state
> machine.
Please explain.

> 3) asynchronous: a state machine never gives up control, reacts
> immediately to events and therefore requires its own thread of
> execution.
This is a tricky "feature": asynchronicity and state machines don't go
together. By definition, because a state machine is about changing the
behaviour of _one single activity_, in a synchronous way...

I do see asynchronous behaviour coordination needs in many applications,
but I have lost the conviction that FSMs are the way to go here.

> 1) requires the following to happen in a TaskContext:
> - generate a (local) CallEvent when a Method is invoked and store it
> into the state machine event queue.
> - step the state machine.
Similar remarks as above.

> A state machine will probably not be interested in _all_ CallEvents,
> so queuing and stepping at each CallEvent would create needless
> overhead. A mechanism which allows a state machine to subscribe to the
> events it is interested in would solve this.
That's a Configuration responsibility, not something the state machine
itself should be responsible for. And in addition: the state machine
should be automatically subscribed to the events that correspond to its
state transition events, shouldn't it?

> 2) this is a special case of subscribing to a CallEvent, one which
> causes the actual invocation of the requested operation to be
> omitted. The state machine then determines whether or not this
> operation is executed and executes it itself.

The exact meaning of "execution" in a state machine should be defined
better... It's not a clear concept to me (anymore).

> 3) run the state machine in a TaskContext with a NonPeriodicActivity
> attached

We should get rid of this "feature": "NonPeriodicActivity" is a wrong
concept (I was partially responsible for its creation, sorry about that!),
since it (i) is a negative term (saying what something is _not_ is always
more confusing than saying what it _is_...) and (ii) it says what it is
not, while it is basically a special case in itself. This is terminology
that opens a can of worms, as the past has proven.

> UML 2.0 variation points
> -------------------------
>
> There is really only one UML 2.0 variation point which is tricky,
> namely the ChangeEvent, which triggers a transition when a condition
> specified by an expression evaluates to true. Peter has expressed his
> concern about whether such an event can be implemented efficiently. I
> agree; the requirements (read: the frequencies) for evaluating this
> condition will simply be too conflicting.
I don't fully understand what you want to say here...

> The proposed default therefore is to evaluate this condition before
> the do behavior is executed. The rationale behind this is to allow the
> ChangeEvent to trigger a transition before the do behavior is executed
> (and thereby skipping its execution) while at the same time creating
> as little overhead as possible.
>
> For conditions that require more frequent evaluation, this event
> type must be realized in terms of an individual component, to which a
> periodic activity with the desired frequency and priority can be
> attached.
Oops, I didn't realize we had made Orocos' runtime behaviour (and its Coordination) that difficult to understand... :-)

> Most other variation points are rather trivial:
>
> - Order of event dequeuing: default to FIFO.
Why is this "trivial"? Because only FIFO makes sense?

> - Allow ChangeEvents to persist even after they evaluate to false:
> default to true. (We don't start searching and removing events from
> the queue - double checks etc. can be performed in the next state)
_If_ queueing takes place, it's not the state machine that should be doing
it.

> Supersteps
> ----------
>
> "Superstep" refers to the behavior of a state machine which
> transitions across multiple states after being triggered externally
> once. This can happen e.g. if there are multiple events in a queue.

I don't like this concept of "supersteps"... At least not for defining the
behaviour and implementation of state machines: the state machine is a
thing that changes state when events arrive. The scheduling of the code
that does this is not something that belongs to the specification of the
state machine, but "somewhere" else... in the Coordination part of RTT
(which has not yet been identified and designed very well...).

> There are good reasons to allow or forbid this behavior in an
> application. For more deterministic execution it would certainly be
> better to forbid supersteps. But in a different scenario it might be
> desirable to process a full event queue and thereby "catch up" on the
> work to be done.

These are _Configuration_ concerns, which should be handled outside of
the state machine definition and specification.

> It is therefore proposed to forbid supersteps by
> default, but to allow overriding this with a maximum allowed number of
> supersteps.
I don't think "forbidding" things is a good policy... We should support
_any_ policy on an equal basis, and maybe provide "OCL" templates that
motivate why certain policies are good in certain contexts.

> Scripting
> ---------
>
> As explained in previous mails a state machine consists mostly of
> scripts (in the meaning of statements interpreted at run time).
> Therefore I've reached the conclusion that it would be very beneficial

Do the scripts determine the _specification_ of the state machine, or the
event handling, or....?

> to implement the underlying state machine logic itself in the
> scripting language.
This is a strange sentence: logic is not an "executable" concept...
State machines can describe Coordination, which should be decoupled from
Computation...

> The main advantage of this would be a high level
> of flexibility for overriding and configuring semantic variation
> points. The main drawback is that such a lightweight, hard real-time
> scripting language expressive enough to make this feasible currently
> does not exist. The problem is of course due to the inherent
> difficulty of hard real-time garbage collection.
>
> As lightweight, hard real-time scripting languages do not exist, an
> intermediate solution is required. One approach is the use of a soft
> real-time scripting language. Herman suggested some time ago to take a
> look at Lua [1], which I did. Lua is a mature, lightweight, MIT licensed
> programming language especially targeted at being embedded. A key feature
> of Lua is the recently added soft real-time incremental garbage
> collector that allows fine-grained control of the collection
> process. Tuning its parameters makes it possible to balance temporal
> behavior against overall performance.
>
> Adopting Lua as the next generation scripting language would have
> several benefits. First of all, it provides a rather non-intrusive
> migration path towards a more powerful extension language. Secondly
> state machines could be initially implemented in Lua, profiled (Lua
> has support for this) and then optimized by moving parts of the
> implementation to C/C++. The foreign function interface for this is
> very efficient and simple to use.
>
> Of course the downside of this approach would be an (initial) decrease
> in hard real time capabilities. However I believe this would be
> acceptable because
>
> - an application that has really critical real-time requirements will
> implement this particular part in C/C++ anyway.
>
> - The amount of such critical code is relatively small compared to
> the amount of code for which slightly softer real time constraints
> would suffice
>
> - There is more optimization possible for periodically executing
> scripts and state machines. Adaptive garbage collection algorithms
> could be developed for learning to precisely schedule garbage
> collection within the idle period.
>
>
> That's it!
>
> I'm sure there are many more issues yet to be discovered, but I think
> it's time to cut loose. Am I given the green light?

No! :-) These issues should be _absolutely_ clear, and currently our
"insights" are still coupling the 4Cs too much...

Herman


Towards an UML 2.0 compliant state machine implementation

On Wed, Feb 25, 2009 at 09:07:53PM +0100, Herman Bruyninckx wrote:
> On Tue, 24 Feb 2009, Markus Klotzbücher wrote:
>
>> Please find my suggestions and thoughts regarding the OROCOS state
>> machine redesign below. Apologies for the long email, but the topic
>> has become quite involved :-)
>
> Indeed. I recently read somewhat about the Computation, Communication,
> Configuration and Coordination concerns for complex software systems, and I
> think this is the conceptual framework we are missing in orocos design to
> make our ideas clear, and to provide the appropriate decoupling!
> (It will also make Klaas happy, because his favourite "models of
> computation" concept fits naturally within this 4C...)
> I am referring to @InCollection{ RadestockEisenbach1996,
> author = {Radestock, Matthias and Eisenbach, Susan},
> title = {Coordination in evolving systems},
> booktitle = {Book: Trends in Distributed Systems CORBA and Beyond},
> pages = {162--176},
> year = {1996}
> }
>
> So, in my answers and remarks below, I will refer to one or more of these
> 4Cs quite often.
>
>> Summary:
>> --------
>>
>> We want UML 2.0 style state machines for various reasons. TimeEvents,
>> real composite states and AND-states are powerful mechanisms;
>> furthermore we are now starting to work on integrating with various
>> model driven engineering tools, so being (reasonably) in line with the
>> major standards will make this much easier.
>>
>> These are the major questions:
>>
>> - how to integrate a state machine into an OROCOS component
>> - how to deal with UML 2.0 variation points in a flexible way
>> - how to go about with scripting
>>
>>
>> Integration into OROCOS components
>> -----------------------------------
>>
>> First of all (as discussed before) a state machine will consist of
>> two parts, a synchronous core and an asynchronous layer around it:
>>
>> 1) The core layer supports a minimal interface for controlling a state
>> machine: starting, stopping, resetting, querying status, queuing an
>> event and requesting a step. The step call will only return once a
>> run-to-completion step (or the maximum allowed supersteps) has been
>> executed.
>
> The concept of a "step" is not clear, in my opinion. It does not belong to
> the traditional (and UML standardized) FSM concepts, so it should not be in
> the core!

I disagree! Although I should rather have written "run-to-completion
step", which is the smallest execution unit of a UML state machine
(see UML Superstructure Spec, pg. 563). Quoting from that page:

"The processing of a single event occurrence by a state machine is
known as a run-to-completion step."
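To make this concrete, a minimal sketch of what such a core could look like. This is illustration only, in Python rather than the eventual C++/Lua, and all names are hypothetical (not RTT API): step() consumes exactly one queued event occurrence, i.e. performs one run-to-completion step.

```python
from collections import deque

class CoreStateMachine:
    """Hypothetical sketch of the synchronous core: a transition table
    plus an event queue; step() performs one run-to-completion step."""

    def __init__(self, initial, transitions):
        self.state = initial
        self.transitions = transitions  # {(state, event): next_state}
        self.queue = deque()            # FIFO by default

    def queue_event(self, event):
        self.queue.append(event)

    def step(self):
        """Process a single event occurrence completely (one
        run-to-completion step); returns True if an event was consumed."""
        if not self.queue:
            return False
        event = self.queue.popleft()
        # An event with no matching transition is simply discarded here;
        # deferring it instead would be another semantic variation point.
        self.state = self.transitions.get((self.state, event), self.state)
        return True

sm = CoreStateMachine("idle", {("idle", "start"): "running",
                               ("running", "stop"): "idle"})
sm.queue_event("start")
sm.queue_event("stop")
sm.step()   # one run-to-completion step: "start" is processed,
            # sm.state is now "running"; "stop" is still queued
```

Whether step() may then go on to drain the rest of the queue is exactly the superstep question discussed further below.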

>> 2) The asynchronous layer can be seen as the glue which ties a state
>> machine to an OROCOS component. It defines which events are delivered
>> to a state machine queue and when and how the state machine is
>> advanced.
>
> Similar remark: "asynchronous layer" is not a well-defined concept.

Ok, agreed. The name might be a bad choice, but what would you call
this? Integration layer?

>> I identified three major use cases:
>>
>> 1) synchronous tracking: a state machine is to be used for tracking
>> the state of a component and is advanced on each CallEvent
>> received.
> Ok.
>
>> The state machine needs to be stepped after each method
>> invocation.
> Needs clarification, because of the "step"-feature.

A method invocation generates a UML::CallEvent. By stepping I mean
executing one run-to-completion step, thereby processing this
event. Of course it is not guaranteed that this event will be
processed, as this depends on the number of events in the queue and
how supersteps are dealt with.

>> 2) synchronous constraining: a state machine is stepped before the
>> requested operation is executed and is in control of actually allowing
>> or denying it. This is the typical use case of a protocol state
>> machine.
> Please explain.

A UML protocol state machine serves to define which operations of an
object are allowed to be executed in which state, in which order, and
which pre- and postconditions must hold. Transitions are labeled with
precondition, CallEvent, postcondition.

The goal is of course separation of concerns: separate the mechanisms an
object provides from the usage policy. Implementation-wise this could be
generated statically or, more configurably, be evaluated at
runtime. Above I described the latter.
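A minimal sketch of the runtime-evaluated variant (Python for illustration; the ProtocolStateMachine and Motor names are hypothetical, not part of any UML or OROCOS API): each transition carries a precondition, the triggering operation, and a postcondition, and the machine decides whether a call goes through.

```python
class ProtocolStateMachine:
    """Hypothetical sketch: transitions labeled
    [precondition] operation / [postcondition] decide whether an
    operation call is allowed in the current state."""

    def __init__(self, initial, transitions):
        # transitions: {(state, operation): (precondition, next_state, postcondition)}
        self.state = initial
        self.transitions = transitions

    def call(self, operation, obj):
        key = (self.state, operation)
        if key not in self.transitions:
            return False              # operation denied in this state
        pre, next_state, post = self.transitions[key]
        if not pre(obj):
            return False              # precondition violated
        getattr(obj, operation)()     # execute the requested operation itself
        assert post(obj), "postcondition violated"
        self.state = next_state
        return True

class Motor:
    def __init__(self): self.powered = False
    def power_on(self): self.powered = True
    def move(self): pass

psm = ProtocolStateMachine("off", {
    ("off", "power_on"): (lambda m: not m.powered, "on", lambda m: m.powered),
    ("on", "move"):      (lambda m: m.powered,     "on", lambda m: True),
})
m = Motor()
psm.call("move", m)      # denied: "move" is not allowed in state "off"
psm.call("power_on", m)  # allowed: Motor is powered, psm moves to "on"
```

Note how the usage policy (which operations in which order) lives entirely in the transition table, separate from the Motor mechanism itself.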

>> 3) asynchronous: a state machine never gives up control, reacts
>> immediately to events and therefore requires its own thread of
>> execution.
> This is a tricky "feature": asynchronicity and state machine don't go
> together. By definition, because a state machine is about changing the
> behaviour of _one single activity_, in a synchronous way...

I get your point, and I think my choice of "asynchronous" might be
unfortunate here again. Maybe "Master" would be better. The use case I
imagined here is a state machine in its own thread which reacts to
events as they arrive. For instance when data arrives on a port and an
event is generated the state machine could read it, invoke some
processing and write it to some out port. That would qualify as
synchronous, wouldn't it? The only difference to the other use cases
is that the behavior of the component is determined by the set of
events reacted to by its state machine, and not by the external trigger.

But maybe this is misuse of a state machine and really the use case
for a script (Activity diagram).

> I do see asynchronous behaviour coordination needs in many applications,
> but I have lost the conviction that FSMs are the way to go here.

I'm curious, what would you consider better suited? Petri nets?

>> 1) requires the following to happen in a TaskContext:
>> - generate a (local) CallEvent when a Method is invoked and store it
>> into the state machine event queue.
>> - step the state machine.
> Similar remarks as above.
>
>> Probably a state machine will not be interested in _all_ CallEvents,
>> so adding and stepping at each CallEvent would create needless
>> overhead. A mechanism which allows a state machine to subscribe to
>> events of interest would solve this.
> That's a Configuration responsibility, not something the state machine

I think this depends on how tight the coupling between a component and
a state machine is. If it's loose I would agree; one could think of
deploying and configuring a state machine "on to" a component.

> itself should be responsible for. And in addition: the state machine
> should be automatically subscribed to the events that correspond to its
> state transition events, isn't it?

Yes, absolutely. My fault for not differentiating between state
machine instance and state machine implementation. The state machine
implementation will subscribe to the events required for processing a
certain instance of a state machine. Or as you suggest above a state
machine instance could be configured to run on a certain component.

>> 2) this is a special case of subscribing to a CallEvent, but which
>> causes the actual invocation of the requested operation to be
>> omitted. The state machine then determines whether or not this
>> operation is executed and executes it itself.
>
> The exact meaning of "execution" in a state machine should be defined
> better... It's not a clear concept to me (anymore).

see above.

>> 3) run state machine in a TaskContext with a NonPeriodicActivity
>> attached
>
> We should get rid of this "feature": "NonPeriodicActivity" is a wrong
> concept (I was partially responsible for its creation, sorry about that!),
> since it (i) is a negative term (saying what something is _not_ is always
> more confusing than saying what it _is_...) and (ii) it says it is not
> something which is basically a special case in itself. This is terminology
> that opens a can of worms, as the past has proven.

Agreed.

>> UML 2.0 variation points
>> -------------------------
>>
>> There is really only one UML 2.0 variation point which is tricky,
>> namely the ChangeEvent which triggers a transition when a condition
>> specified by an expression evaluates to true. Peter has expressed his
>> concern whether such an event can be efficiently implemented. I agree;
>> the required conditions (read: the frequency) for evaluating this will
>> simply be too conflicting.
> I don't fully understand what you want to say here...

A ChangeEvent is denoted on a transition as an expression which
returns a boolean result. When this expression becomes true, the
transition triggers. However UML leaves it completely open how and
when (e.g. at fixed times or continuously) this expression is
evaluated.
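One way to picture this open point: the expression is sampled at whatever moments the implementation chooses, and an event is emitted on the false-to-true edge. A Python sketch with hypothetical names (when evaluate() gets called, per step, per period, or before the do-behavior, is exactly the variation point left open):

```python
class ChangeEvent:
    """Hypothetical sketch: a boolean expression that, on a
    false-to-true edge, produces an event occurrence."""

    def __init__(self, expr, event):
        self.expr = expr      # the boolean expression on the transition
        self.event = event    # the event to queue when it becomes true
        self.last = False

    def evaluate(self, queue):
        value = self.expr()
        if value and not self.last:
            queue.append(self.event)   # trigger only on the rising edge
        self.last = value

temperature = [20]
overheat = ChangeEvent(lambda: temperature[0] > 90, "overheat")
queue = []
overheat.evaluate(queue)   # expression is False: nothing queued
temperature[0] = 95
overheat.evaluate(queue)   # False -> True edge: "overheat" queued
overheat.evaluate(queue)   # still True: no duplicate event
```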

>> The proposed default therefore is to evaluate this condition before
>> the do behavior is executed. The rationale behind this is to allow the
>> ChangeEvent to trigger a transition before the do behavior is executed
>> (and thereby skipping its execution) while at the same time creating
>> as little overhead as possible.
>>
>> For conditions that require more frequent evaluation, this
>> Event type must be realized in terms of an individual component, to
>> which a periodic activity with the desired frequency and priority can
>> be attached.
> Oops, I didn't realize we had made Orocos' runtime behaviour (and
> its Coordination) that difficult to understand... :-)

Huh, what am I missing?

>> Most other variation points are rather trivial:
>>
>> - Order of event dequeuing: default to FIFO.
> Why is this "trivial"? Because only FIFO makes sense?

No, because only FIFO makes sense *as a default*. But for instance
when dealing with large numbers of events (so that the event queue is
constantly filled) some prioritization scheme could make sense.
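This suggests hiding the dequeuing order behind a small queue interface, so the policy becomes a Configuration choice rather than part of the machine definition. A Python sketch (hypothetical names; a real-time implementation would of course need a bounded, lock-free container):

```python
import heapq

class FifoQueue:
    """Default policy: first in, first out."""
    def __init__(self): self.items = []
    def put(self, event, priority=0): self.items.append(event)
    def get(self): return self.items.pop(0)

class PriorityQueue:
    """Alternative policy for saturated queues: highest priority first,
    FIFO among equal priorities (the counter breaks ties)."""
    def __init__(self):
        self.heap, self.counter = [], 0
    def put(self, event, priority=0):
        heapq.heappush(self.heap, (-priority, self.counter, event))
        self.counter += 1
    def get(self):
        return heapq.heappop(self.heap)[2]

# The state machine only sees put()/get(), so swapping the policy
# does not touch the machine definition at all.
q = PriorityQueue()
q.put("log", priority=0)
q.put("emergency_stop", priority=9)
q.get()   # "emergency_stop" jumps ahead of "log"
```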

>> - Allow ChangeEvents to persist even after they evaluate to false:
>> default to true. (We don't start searching and removing events from
>> the queue - double checks etc. can be performed in the next state)
> _If_ queueing takes place, it's not the state machine that should be doing
> it.

As the detection of ChangeEvents involves evaluating expressions
which are part of the state machine description, it seems a
reasonable choice to let the state machine implementation evaluate
these; it would then in turn be responsible for queuing the
event. Whoever queues an event should remove it too.

>> Supersteps
>> ----------
>>
>> "Superstep" refers to the behavior of a state machine which
>> transitions across multiple states after being externally triggered
>> only once. This can happen e.g. if there are multiple events in a queue.
>
> I don't like this concept of "supersteps"... At least not for defining the
> behaviour and implementation of state machines: the state machine is a
> thing that changes state when events arrive. The scheduling of the code
> that does this is not something that belongs to the specification of the
> state machine. But "somewhere" else... in the Coordination part of RTT
> (which has not yet been identified and designed very well...).

I agree.

>> There are good reasons to allow or forbid this behavior in an
>> application. For more deterministic execution it would certainly be
>> better to forbid supersteps. But in a different scenario it might be
>> desirable to process a full event queue and thereby "catch up" on the
>> work to be done.
>
> These are _Configuration_, which should be done outside of state machine
> definition and specification.

Yes, I agree. I don't see how your last two paragraphs conflict with
what I wrote?

>> It is therefore proposed to forbid supersteps by
>> default, but allow this to be overridden with a maximum allowed.
> I don't think "forbidding" things is a good policy... We should support
> _any_ policy on an equal basis, and maybe provide "OCL" templates that
> motivate why certain policies are good in certain contexts.

That's what I wrote, isn't it? Disallow by default but allow for
configuration?

>> Scripting
>> ---------
>>
>> As explained in previous mails a state machine consists mostly of
>> scripts (in the meaning of statements interpreted at run time).
>> Therefore I've reached the conclusion that it would be very beneficial
>
> Do the scripts determine the _specification_ of the state machine, or the
> event handling, or....?

Most importantly these scripts determine the behavior of a state
machine. An example of a script is a guard condition which, when
evaluated, determines whether a transition is enabled or not. In a
way the scripts also affect the specification, as the latter will have
to adhere to the syntax of the former.
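A sketch of a guard as an interpreted script (Python's eval stands in here for the eventual Lua interpreter; all names are hypothetical): the guard is stored as source text and only evaluated, against the machine's current variables, when the transition is considered.

```python
class Transition:
    """Hypothetical sketch: a transition whose guard is a stored script,
    here a Python expression string; in the proposal it would be a Lua chunk."""

    def __init__(self, source, event, guard, target):
        self.source, self.event = source, event
        self.guard, self.target = guard, target

    def enabled(self, state, event, variables):
        # The guard script is interpreted at run time against the
        # machine's variables, not compiled into the machine.
        return (state == self.source and event == self.event
                and eval(self.guard, {}, dict(variables)))

t = Transition("moving", "obstacle", "distance < 0.5", "stopped")
t.enabled("moving", "obstacle", {"distance": 0.3})   # guard holds: enabled
t.enabled("moving", "obstacle", {"distance": 2.0})   # guard fails: disabled
```

Because the guard is plain text, overriding or reconfiguring it needs no recompilation, which is the flexibility argument made above.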

>> to implement the underlying state machine logic itself in the
>> scripting language.
> This is a strange sentence: logic is not an "executable" concept...
> State machines can describe Coordination, which should be decoupled from
> Computation...

Yes, but describing this coordination requires computation, namely
computing the change in state given a certain input! And while it is
an implementation detail whether this computation is implemented in
Lua, C++ or whatever, this choice will have a major impact on
configurability, usability and also reusability!

>> The main advantage of this would be a high level
>> of flexibility for overriding and configuring semantic variation
>> points. The main drawback is that such a lightweight, hard real-time
>> scripting language expressive enough to make this feasible currently
>> does not exist. The problem is of course due to the inherent
>> difficulty of hard real-time garbage collection.
>>
>> As lightweight, hard real-time scripting languages do not exist, an
>> intermediate solution is required. One approach is the use of a soft
>> real-time scripting language. Herman suggested some time ago to take a
>> look at Lua [1], which I did. Lua is a mature, lightweight, MIT licensed
>> programming language especially targeted at being embedded. A key feature
>> of Lua is the recently added soft real-time incremental garbage
>> collector that allows fine-grained control of the collection
>> process. Tuning its parameters makes it possible to balance temporal
>> behavior against overall performance.
>>
>> Adopting Lua as the next generation scripting language would have
>> several benefits. First of all, it provides a rather non-intrusive
>> migration path towards a more powerful extension language. Secondly
>> state machines could be initially implemented in Lua, profiled (Lua
>> has support for this) and then optimized by moving parts of the
>> implementation to C/C++. The foreign function interface for this is
>> very efficient and simple to use.
>>
>> Of course the downside of this approach would be an (initial) decrease
>> in hard real time capabilities. However I believe this would be
>> acceptable because
>>
>> - an application that has really critical real-time requirements will
>> implement this particular part in C/C++ anyway.
>>
>> - The amount of such critical code is relatively small compared to
>> the amount of code for which slightly softer real time constraints
>> would suffice
>>
>> - There is more optimization possible for periodically executing
>> scripts and state machines. Adaptive garbage collection algorithms
>> could be developed for learning to precisely schedule garbage
>> collection within the idle period.
>>
>>
>> That's it!
>>
>> I'm sure there are many more issues yet to be discovered, but I think
>> it's time to cut loose. Am I given the green light?
>
> No! :-) These issues should be _absolutely_ clear, and currently our
> "insights" are still coupling the 4Cs too much...

I knew it ;-)

Best regards
Markus

Towards an UML 2.0 compliant state machine implementation

On Thu, 26 Feb 2009, Markus Klotzbücher wrote:

> On Wed, Feb 25, 2009 at 09:07:53PM +0100, Herman Bruyninckx wrote:
>> On Tue, 24 Feb 2009, Markus Klotzbücher wrote:
>>
>>> Please find my suggestions and thoughts regarding the OROCOS state
>>> machine redesign below. Apologies for the long email, but the topic
>>> has become quite involved :-)
>>
>> Indeed. I recently read somewhat about the Computation, Communication,
>> Configuration and Coordination concerns for complex software systems, and I
>> think this is the conceptual framework we are missing in orocos design to
>> make our ideas clear, and to provide the appropriate decoupling!
>> (It will also make Klaas happy, because his favourite "models of
>> computation" concept fits naturally within this 4C...)
>> I am referring to @InCollection{ RadestockEisenbach1996,
>> author = {Radestock, Matthias and Eisenbach, Susan},
>> title = {Coordination in evolving systems},
>> booktitle = {Book: Trends in Distributed Systems CORBA and Beyond},
>> pages = {162--176},
>> year = {1996}
>> }
>>
>> So, in my answers and remarks below, I will refer to one or more of these
>> 4Cs quite often.
>>
>>> Summary:
>>> --------
>>>
>>> We want UML 2.0 style state machines for various reasons. TimeEvents,
>>> real composite states and AND-states are powerful mechanisms;
>>> furthermore we are now starting to work on integrating with various
>>> model driven engineering tools, so being (reasonably) in line with the
>>> major standards will make this much easier.
>>>
>>> These are the major questions:
>>>
>>> - how to integrate a state machine into an OROCOS component
>>> - how to deal with UML 2.0 variation points in a flexible way
>>> - how to go about with scripting
>>>
>>>
>>> Integration into OROCOS components
>>> -----------------------------------
>>>
>>> First of all (as discussed before) a state machine will consist of
>>> two parts, a synchronous core and an asynchronous layer around it:
>>>
>>> 1) The core layer supports a minimal interface for controlling a state
>>> machine: starting, stopping, resetting, querying status, queuing an
>>> event and requesting a step. The step call will only return once a
>>> run-to-completion step (or the maximum allowed supersteps) has been
>>> executed.
>>
>> The concept of a "step" is not clear, in my opinion. It does not belong to
>> the traditional (and UML standardized) FSM concepts, so it should not be in
>> the core!
>
> I disagree! Although I should rather have written "run-to-completion
> step", which is the smallest execution unit of a UML state machine
> (see UML Superstructure Spec, pg. 563). Quoting from that page:
>
> "The processing of a single event occurrence by a state machine is
> known as a run-to-completion step."

OK, I stand corrected! So, please always use the official terminology _if_
it exists, in order to avoid confusion.

What does UML tell us about the computational aspects of "run-to-completion
steps"?

>>> 2) The asynchronous layer can be seen as the glue which ties a state
>>> machine to an OROCOS component. It defines which events are delivered
>>> to a state machine queue and when and how the state machine is
>>> advanced.
>>
>> Similar remark: "asynchronous layer" is not a well-defined concept.
>
> Ok, agreed. The name might be a bad choice, but what would you call
> this? Integration layer?

"Coordination" (as one of the four Cs, together with Computation,
Communication and Configuration). Communication and Coordination deal with
asynchronicity, the others shouldn't (if this is possible...).

>>> I identified three major use cases:
>>>
>>> 1) synchronous tracking: a state machine is to be used for tracking
>>> the state of a component and is advanced on each CallEvent
>>> received.
>> OK.
>>
>>> The state machine needs to be stepped after each method
>>> invocation.
>> Needs clarification, because of the "step"-feature.
>
> A method invocation generates a UML::CallEvent. By stepping I mean
> executing one run-to-completion step, thereby processing this
> event. Of course it is not guaranteed that this event will be
> processed, as this depends on the number of events in the queue and
> how supersteps are dealt with.

This is "Configuration"? What does UML say about which policies are
valid/allowed/...? Or is this a "semantic variation point"?

>>> 2) synchronous constraining: a state machine is stepped before the
>>> requested operation is executed and is in control of actually allowing
>>> or denying it. This is the typical use case of a protocol state
>>> machine.
>> Please explain.
>
> A UML protocol state machine serves to define which operations of an
> object are allowed to be executed in which state, in which order, and
> which pre- and postconditions must hold. Transitions are labeled with
> precondition, CallEvent, postcondition.

OK, this is all standard UML-speak? Do the precondition and postcondition
also involve their own states/transitions/steps/...?

> The goal is of course separation of concerns: separate the mechanisms an
> object provides from the usage policy. Implementation-wise this could be
> generated statically or, more configurably, be evaluated at
> runtime. Above I described the latter.
>
>>> 3) asynchronous: a state machine never gives up control, reacts
>>> immediately to events and therefore requires its own thread of
>>> execution.
>> This is a tricky "feature": asynchronicity and state machine don't go
>> together. By definition, because a state machine is about changing the
>> behaviour of _one single activity_, in a synchronous way...
>
> I get your point, and I think my choice of "asynchronous" might be
> unfortunate here again. Maybe "Master" would be better.

That terminology doesn't make my understanding any deeper... :-)

> The use case I
> imagined here is a state machine in its own thread

...by definition...

> which reacts to
> events as they arrive. For instance when data arrives on a port and an
> event is generated, the state machine could read it, invoke some
> processing and write it to some out port. That would qualify as
> synchronous, wouldn't it?

Yes! And maybe we should make it _atomic_ too...? (The "Genom" line of work
from LAAS in Toulouse has taken atomic execution (of so-called "codels") as
their main Computation primitive; and I start to like it more and more, in
some contexts, such as this one of state machine execution.)
The question then is: atomic wrt which other activities... Probably it's
enough to have atomicity wrt processing the other events in the same
state machine. A "guaranteed commit" processing, in database speak. Or in
other words: for the event handling infrastructure, a state machine can
always be in only one determined state, and time is "frozen" for the event
handling infrastructure while a state machine is making a transition.
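This atomicity could be sketched as follows (Python with a plain mutex, purely for illustration; names hypothetical): the lock is held for the whole run-to-completion step, so observers only ever see the machine in a determined state, never mid-transition.

```python
import threading
from collections import deque

class AtomicStateMachine:
    """Hypothetical sketch of the atomicity idea: a lock makes each
    run-to-completion step atomic, so concurrent steps are serialized
    and the state is never observable mid-transition."""

    def __init__(self, initial, transitions):
        self.state = initial
        self.transitions = transitions  # {(state, event): next_state}
        self.queue = deque()
        self.lock = threading.Lock()

    def queue_event(self, event):
        with self.lock:
            self.queue.append(event)

    def step(self):
        with self.lock:   # "time is frozen" during the transition
            if not self.queue:
                return
            event = self.queue.popleft()
            self.state = self.transitions.get((self.state, event), self.state)

    def current_state(self):
        with self.lock:   # always observes one determined state
            return self.state
```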

> The only difference to the other use cases
> is that the behavior of the component is determined by the set of
> events reacted to by its state machine, and not by the external trigger.

I don't understand...

> But maybe this is misuse of a state machine and really the use case
> for a script (Activity diagram).
Interesting remark! I don't have a clear answer, however...

>> I do see asynchronous behaviour coordination needs in many applications,
>> but I have lost the conviction that FSMs are the way to go here.
>
> I'm curious, what would you consider better suited? Petri nets?
That's the simplest concept that I know of that explicitly carries the
concept of activities, hence of asynchronicity. Like state machines, Petri
nets come in many colours and flavours, so also there we have to be able to
sort out what the 4Cs are, and what the (Communication) mechanism and
(Coordination) policy are.

>>> 1) requires the following to happen in a TaskContext:
>>> - generate a (local) CallEvent when a Method is invoked and store it
>>> into the state machine event queue.
>>> - step the state machine.
>> Similar remarks as above.
>>
>>> Probably a state machine will not be interested in _all_ CallEvents,
>>> so adding and stepping at each CallEvent would create needless
>>> overhead. A mechanism which allows a state machine to subscribe to
>>> events of interest would solve this.
>> That's a Configuration responsibility, not something the state machine
>
> I think this depends on how tight the coupling between a component and
> a state machine is.
That should be a 100% tight coupling: the state machine defines the
behaviour of the component! Unless we discriminate between single-activity
components and multi-activity components... The latter should get a
different name, I think.

> If it's loose I would agree; one could think of
> deploying and configuring a state machine "on to" a component.
>
>> itself should be responsible for. And in addition: the state machine
>> should be automatically subscribed to the events that correspond to its
>> state transition events, isn't it?
>
> Yes, absolutely. My fault for not differentiating between state
> machine instance and state machine implementation. The state machine
> implementation will subscribe to the events required for processing a
> certain instance of a state machine. Or as you suggest above, a state
> machine instance could be configured to run on a certain component.
>
>>> 2) this is a special case of subscribing to a CallEvent, but which
>>> causes the actual invocation of the requested operation to be
>>> omitted. The state machine then determines whether or not this
>>> operation is executed and executes it itself.
>>
>> The exact meaning of "execution" in a state machine should be defined
>> better... It's not a clear concept to me (anymore).
>
> See above.
OK.

>>> 3) run state machine in a TaskContext with a NonPeriodicActivity
>>> attached
>>
>> We should get rid of this "feature": "NonPeriodicActivity" is a wrong
>> concept (I was partially responsible for its creation, sorry about that!),
>> since it (i) is a negative term (saying what something is _not_ is always
>> more confusing than saying what it _is_...) and (ii) it says it is not
>> something which is basically a special case in itself. This is terminology
>> that opens a can of worms, as the past has proven.
>
> Agreed.
>
>>> UML 2.0 variation points
>>> -------------------------
>>>
>>> There is really only one UML 2.0 variation point which is tricky,
>>> namely the ChangeEvent, which triggers a transition when a condition
>>> specified by an expression evaluates to true. Peter has expressed his
>>> concern whether such an event can be efficiently implemented. I agree;
>>> the required conditions (read: the frequency) for evaluating this will
>>> simply be too conflicting.
>> I don't fully understand what you want to say here...
>
> A ChangeEvent is denoted on a transition as an expression which
> returns a boolean result. When this expression becomes true, the
> transition triggers. However UML leaves it completely open how and
> when (e.g. at fixed times or continuously) this expression is
> evaluated.

OK, this is an important "semantic variation point"! (I like that name,
although it is way too "learned" :-))

And my feeling is that klaas' models of computation, or Radestock and
Eisenbach's 4Cs, are the appropriate concepts to use to define mechanism
and policy in this respect.

>>> The proposed default therefore is to evaluate this condition before
>>> the do behavior is executed. The rationale behind this is to allow the
>>> ChangeEvent to trigger a transition before the do behavior is executed
>>> (and thereby skipping its execution) while at the same time creating
>>> as little overhead as possible.
>>>
>>> For conditions that require more frequent evaluation, this
>>> Event type must be realized in terms of an individual component, to
>>> which a periodic activity with the desired frequency and priority can
>>> be attached.
>> Oops, I didn't realize we had made Orocos' runtime behaviour (and
>> its Coordination) that difficult to understand... :-)
>
> Huh, what am I missing?
What am _I_ missing? :-) I really didn't (and still don't) understand that
last paragraph of yours :-)

>>> Most other variation points are rather trivial:
>>>
>>> - Order of event dequeuing: default to FIFO.
>> Why is this "trivial"? Because only FIFO makes sense?
>
> No, because only FIFO makes sense *as a default*. But for instance
> when dealing with large numbers of events (so that the event queue is
> constantly filled) some prioritization scheme could make sense.
OK.

>>> - Allow ChangeEvents to persist even after they evaluate to false:
>>> default to true. (We don't start searching and removing events from
>>> the queue - double checks etc. can be performed in the next state)
>> _If_ queueing takes place, it's not the state machine that should be doing
>> it.
>
> As the detection of ChangeEvents involves evaluating expressions
> which are part of the state machine description, it seems a
> reasonable choice to let the state machine implementation evaluate
> these; it would then in turn be responsible for queuing the
> event. Whoever queues an event should remove it too.
Yes!

>>> Supersteps
>>> ----------
>>>
>>> "Superstep" refers to the behavior of a state machine which
>>> transitions across multiple states after being externally triggered
>>> only once. This can happen e.g. if there are multiple events in a queue.
>>
>> I don't like this concept of "supersteps"... At least not for defining the
>> behaviour and implementation of state machines: the state machine is a
>> thing that changes state when events arrive. The scheduling of the code
>> that does this is not something that belongs to the specification of the
>> state machine. But "somewhere" else... in the Coordination part of RTT
>> (which has not yet been identified and designed very well, yet...).
>
> I agree.
>
>>> There are good reasons to allow or forbid this behavior in an
>>> application. For more deterministic execution it would certainly be
>>> better to forbid supersteps. But in a different scenario it might be
>>> desirable to process a full event queue and thereby "catch up" on the
>>> work to be done.
>>
>> These are _Configuration_ concerns, which should be handled outside
>> of the state machine definition and specification.
>
> Yes I agree. I don't see your last two paragraphs conflicting with
> what I wrote?
Then that's fine! :-)

>>> It is therefore proposed to forbid supersteps by default, but to
>>> allow this to be overridden with a configurable maximum.
>> I don't think "forbidding" things is a good policy... We should support
>> _any_ policy on an equal basis, and maybe provide "OCL" templates that
>> motivate why certain policies are good in certain contexts.
>
> That's what I wrote, isn't it? Disallow by default but allow for
> configuration?
Okay, I now understand the nuance in your statement :-)
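That nuance could be expressed roughly like this in C++ (a sketch with hypothetical names, not the proposed API): the default maximum of 1 forbids supersteps, while a larger configured value lets the machine "catch up" on a full queue in one external trigger.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical sketch: step() processes at most `max_supersteps` queued
// events per external trigger. The default of 1 forbids supersteps;
// the application can override it at configuration time.
class Machine {
public:
    explicit Machine(std::size_t max_supersteps = 1)
        : max_supersteps_(max_supersteps) {}

    void queue_event(int ev) { events_.push_back(ev); }

    // Returns the number of run-to-completion steps actually executed.
    std::size_t step() {
        std::size_t done = 0;
        while (!events_.empty() && done < max_supersteps_) {
            events_.pop_front();   // dispatch one event (details elided)
            ++done;
        }
        return done;
    }
private:
    std::size_t max_supersteps_;
    std::deque<int> events_;
};
```

Keeping the limit a plain configuration parameter of the execution layer, rather than part of the state machine definition, also matches Herman's point that this is a Configuration concern.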

>>> Scripting
>>> ---------
>>>
>>> As explained in previous mails, a state machine consists mostly of
>>> scripts (in the sense of statements interpreted at run time).
>>> Therefore I've reached the conclusion that it would be very beneficial
>>
>> Do the scripts determine the _specification_ of the state machine, or the
>> event handling, or....?
>
> Most importantly these scripts determine the behavior of a state
> machine. An example for a script is a guard condition, which when
> evaluated determines whether a transition will be enabled or not. In a
> way the scripts also affect the specification, as the latter will have
> to adhere to the syntax of the first.
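A guard condition of that kind could be modeled in C++ roughly as follows (a sketch with hypothetical names; in the proposal the guard would be an interpreted expression, e.g. in Lua, rather than a compiled predicate):

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch: each transition carries a guard condition over
// some context; a transition is enabled only if its guard holds.
struct Context { double battery; };

struct Transition {
    std::string target;                         // target state name
    std::function<bool(const Context&)> guard;  // guard condition "script"
};

// Return the first enabled transition, or nullptr if none is enabled.
inline const Transition* select(const std::vector<Transition>& ts,
                                const Context& ctx) {
    for (const auto& t : ts)
        if (!t.guard || t.guard(ctx)) return &t;
    return nullptr;
}
```

The point about specification following syntax then becomes visible: whatever language the guard expressions are written in dictates what the state machine description may contain.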
>
>>> to implement the underlying state machine logic itself in the
>>> scripting language.
>> This is a strange sentence: logic is not an "executable" concept...
>> State machines can describe Coordination, which should be decoupled from
>> Computation...
>
> Yes, but describing this coordination requires computation, namely
> computing the change in state given a certain input!
Agreed. But it would be very good if we could indefinitely keep the
abstract concept of a state machine separate from the implementation of its
(atomic?) execution.

> And while it is an implementation detail whether this computation is
> implemented in Lua, C++ or whatever, this choice will have a major
> impact on configurability, usability and also reusability!
Yes. Things will only be reusable if their meaning and implementation
effects are fully documented (and predictable).

>>> The main advantage of this would be a high level
>>> of flexibility for overriding and configuring semantic variation
>>> points. The main drawback is that a lightweight, hard real-time
>>> scripting language expressive enough to make this feasible does not
>>> currently exist. The problem is of course the inherent difficulty
>>> of hard real-time garbage collection.
>>>
>>> As such a language does not yet exist, an intermediate solution is
>>> required. One approach is the use of a soft real-time scripting
>>> language. Herman suggested some time ago to take a look at Lua [1],
>>> which I did. Lua is a mature, lightweight, MIT-licensed programming
>>> language especially targeted at being embedded. A key feature of
>>> Lua is the recently added soft real-time incremental garbage
>>> collector, which allows fine-grained control of the collection
>>> process. Tuning its parameters makes it possible to balance
>>> temporal behavior against overall performance.
>>>
>>> Adopting Lua as the next-generation scripting language for OROCOS
>>> would have several benefits. First of all, it provides a rather
>>> non-intrusive migration path towards a more powerful extension
>>> language. Secondly,
>>> state machines could be initially implemented in Lua, profiled (Lua
>>> has support for this) and then optimized by moving parts of the
>>> implementation to C/C++. The foreign function interface for this is
>>> very efficient and simple to use.
>>>
>>> Of course the downside of this approach would be an (initial)
>>> decrease in hard real-time capabilities. However, I believe this
>>> would be acceptable because:
>>>
>>> - an application that has really critical real-time requirements
>>> will implement that particular part in C/C++ anyway.
>>>
>>> - the amount of such critical code is relatively small compared to
>>> the amount of code for which slightly softer real-time constraints
>>> would suffice.
>>>
>>> - there is more optimization possible for periodically executed
>>> scripts and state machines: adaptive garbage collection algorithms
>>> could be developed that learn to schedule collection precisely
>>> within the idle period.
>>>
>>>
>>> That's it!
>>>
>>> I'm sure there are many more issues yet to be discovered, but I think
>>> it's time to cut loose. Am I given the green light?
>>
>> No! :-) These issues should be _absolutely_ clear, and currently our
>> "insights" are still coupling the 4Cs too much...
>
> I knew it ;-)
>
> Best regards
> Markus

Herman

--
K.U.Leuven, Mechanical Eng., Mechatronics & Robotics Research Group
<http://people.mech.kuleuven.be/~bruyninc> Tel: +32 16 322480
Coordinator of EURON (European Robotics Research Network)
<http://www.euron.org>
Open Realtime Control Services <http://www.orocos.org>

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm