Setting component to run in realtime

We have our component, called RT Ops, which runs a 1 kHz loop using a
Timer. When the stopHook() function of RT Ops is called, it should
kill the Timer for our cyclic loop. However, we have found that this
does not always guarantee that the loop stops.

Our suspicion is that the Timer is running in a different thread from
the stopHook() -- allowing stopHook() to run while the timer's cyclic
task is already running. When this happens, the loop finishes up that
cycle -- sending new values to our hardware (which is undesired) or
segfaulting.

How do we force synchronization of this Timer and RT Ops itself?

Here is the beginning of the constructor for ECatComm, the class that
runs the cyclic timer (this class is instantiated by the main RTOps
class):

ECatComm::ECatComm(RTOps* rt_ops) :
RTT::os::Timer(1, ORO_SCHED_RT, OROCOS_PRIO) {

Here is our run script for this component:

import("atrias_rt_ops")

# Load necessary components.
loadComponent("atrias_rt", "RTOps")

setActivity("atrias_rt", 0, HighestPriority, ORO_SCHED_RT)

# Create connections
var ConnPolicy controller_req
var ConnPolicy adata
var ConnPolicy adebug

# Configure components.
atrias_rt.configure()

# Start components.
atrias_rt.start()

Setting component to run in realtime

Hi Johnathan,

On Thu, Aug 2, 2012 at 8:37 PM, Johnathan Van Why <jrvanwhy [..] ...> wrote:
> We have our component, called RT Ops, which runs a 1 kHz loop using a
> Timer. When the stopHook() function of RT Ops is called, it should
> kill the Timer for our cyclic loop. However, we have found that this
> does not always guarantee that the loop stops.
>
> Our suspicion is that the Timer is running in a different thread from
> the stopHook() -- allowing stopHook() to run while the timer's cyclic
> task is already running. When this happens, the loop finishes up that
> cycle -- sending new values to our hardware (which is undesired) or
> segfaulting.
>
> How do we force synchronization of this Timer and RT Ops itself?

Use an RTT::os::Mutex in stopHook() and the timeout function.

You could also restructure things so that updateHook() is called instead
of this timeout function, in which case updateHook() and stopHook() will
never be called concurrently, as per the TaskContext execution semantics.
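
The mutex suggestion above can be sketched roughly as follows. This is only
an illustration in standard C++: std::mutex stands in for RTT::os::Mutex, and
CyclicLoop, FakeHardware, timeout(int), stop() and commands_after_stop() are
hypothetical names, not part of RTT or of the poster's code. The idea is that
both the timer callback and the stop routine take the same lock, and that the
callback checks a "stopped" flag before touching the hardware.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the hardware interface; records every command.
struct FakeHardware {
    std::vector<int> sent;
    void send(int value) { sent.push_back(value); }
};

class CyclicLoop {
public:
    explicit CyclicLoop(FakeHardware& hw) : hw_(hw) {}

    // Called from the timer thread once per cycle (the "timeout function").
    void timeout(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (stopped_) return;  // stop() already ran: do not touch the hardware
        hw_.send(value);
    }

    // Called from the component's thread (plays the role of stopHook()).
    void stop() {
        // Blocks until a cycle that is currently in progress has finished.
        std::lock_guard<std::mutex> lock(mutex_);
        stopped_ = true;
    }

private:
    FakeHardware& hw_;
    std::mutex mutex_;
    bool stopped_ = false;
};

// Runs a timer-like thread, stops the loop, and returns how many commands
// reached the hardware after stop() returned.
std::size_t commands_after_stop() {
    FakeHardware hw;
    CyclicLoop loop(hw);
    std::atomic<bool> quit{false};
    std::thread timer([&] {
        int i = 0;
        while (!quit.load()) {
            loop.timeout(i++);
            std::this_thread::sleep_for(std::chrono::microseconds(100));
        }
    });
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
    loop.stop();
    const std::size_t count_at_stop = hw.sent.size();
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
    quit = true;
    timer.join();
    return hw.sent.size() - count_at_stop;
}
```

Note that the flag check matters as much as the lock: the lock alone only
prevents overlap, while the flag is what keeps a late timer callback from
sending new values once stop() has returned.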

>
> Here is the beginning of the constructor for ECatComm, the class that
> runs the cyclic timer (this class is instantiated by the main RTOps
> class):
>
> ECatComm::ECatComm(RTOps* rt_ops) :
> RTT::os::Timer(1, ORO_SCHED_RT, OROCOS_PRIO) {

We often use an OCL::TimerComponent which sends events to a port of the
RT Ops component, waking up its updateHook(), which is always serialized
with stop()/stopHook().
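
To illustrate why this avoids the race, here is a minimal sketch in plain
C++. It does not use RTT or OCL and is not how they are implemented; it only
shows the general pattern. SerializedComponent, Event, post() and
process_pending() are hypothetical names. The timer only enqueues a tick;
the component's own thread is the only one that ever runs the update or stop
logic, so the two cannot overlap.

```cpp
#include <cassert>
#include <deque>
#include <mutex>

enum class Event { Tick, Stop };

class SerializedComponent {
public:
    // May be called from any thread (e.g. a timer): it only enqueues.
    void post(Event e) {
        std::lock_guard<std::mutex> lock(queue_mutex_);
        queue_.push_back(e);
    }

    // Called only from the component's own thread; drains pending events.
    void process_pending() {
        for (;;) {
            Event e = Event::Tick;
            {
                std::lock_guard<std::mutex> lock(queue_mutex_);
                if (queue_.empty()) return;
                e = queue_.front();
                queue_.pop_front();
            }
            if (e == Event::Stop) {
                running_ = false;   // plays the role of stopHook()
            } else if (running_) {
                ++updates_;         // plays the role of updateHook()
            }
        }
    }

    int updates() const { return updates_; }

private:
    std::mutex queue_mutex_;
    std::deque<Event> queue_;
    bool running_ = true;
    int updates_ = 0;
};

// Posts tick, tick, stop, tick and returns how many updates were executed.
int updates_for_tick_tick_stop_tick() {
    SerializedComponent c;
    c.post(Event::Tick);
    c.post(Event::Tick);
    c.post(Event::Stop);
    c.post(Event::Tick);  // arrives after the stop request: must be ignored
    c.process_pending();
    return c.updates();
}
```

Only the queue needs a lock here; the component's state is touched by a
single thread, so the rest of the component does not have to be made
thread-safe.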

Peter

Setting component to run in realtime

On Thu, 2 Aug 2012, Johnathan Van Why wrote:

> We have our component, called RT Ops, which runs a 1 kHz loop using a
> Timer. When the stopHook() function of RT Ops is called, it should
> kill the Timer for our cyclic loop. However, we have found that this
> does not always guarantee that the loop stops.

Never try to "kill" one activity from another one! Instead, send an
event/message to the other activity asking it to shut itself down properly.

You _could_ even agree on a protocol where the suicidal activity first sends
an "ack" response to the activity that requests the suicide. This extra
protocol comes in handy when your application requires several other
activities to be stopped in a coordinated way.
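
A request-plus-ack shutdown of this kind might look like the sketch below.
It is written in standard C++ with std::thread, not against the RTT API, and
CyclicActivity, request_stop_and_wait() and no_cycles_after_ack() are
hypothetical names chosen for illustration. The requesting side only sets a
flag and waits; the cyclic activity notices the flag at a point of its own
choosing, finishes cleanly, and then acknowledges.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

class CyclicActivity {
public:
    void start() { thread_ = std::thread([this] { run(); }); }

    // Called by another activity: ask for shutdown, then wait for the ack.
    void request_stop_and_wait() {
        stop_requested_ = true;
        std::unique_lock<std::mutex> lock(mutex_);
        acked_cv_.wait(lock, [this] { return acked_; });
    }

    void join() {
        if (thread_.joinable()) thread_.join();
    }

    int cycles() const { return cycles_; }
    int cycles_at_ack() const { return cycles_at_ack_; }

private:
    void run() {
        while (!stop_requested_.load()) {
            ++cycles_;  // stand-in for one control cycle
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        // The activity has left its loop on its own; now acknowledge.
        {
            std::lock_guard<std::mutex> lock(mutex_);
            cycles_at_ack_ = cycles_;
            acked_ = true;
        }
        acked_cv_.notify_all();
    }

    std::atomic<bool> stop_requested_{false};
    std::mutex mutex_;
    std::condition_variable acked_cv_;
    bool acked_ = false;
    int cycles_ = 0;
    int cycles_at_ack_ = 0;
    std::thread thread_;
};

// Returns true if no control cycle ran after the ack was sent.
bool no_cycles_after_ack() {
    CyclicActivity activity;
    activity.start();
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    activity.request_stop_and_wait();
    activity.join();
    return activity.cycles() == activity.cycles_at_ack();
}
```

Once request_stop_and_wait() returns, the requester knows the loop has
executed its last cycle, which is the guarantee the original question was
after.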

> Our suspicion is that the Timer is running in a different thread from
> the stopHook() -- allowing stopHook() to run while the timer's cyclic
> task is already running. When this happens, the loop finishes up that
> cycle -- sending new values to our hardware (which is undesired) or
> segfaulting.

My advice above holds even more, to the nth power, for concurrently
running activities! You can _never_ guarantee deterministic shutdown
behaviour from one activity to another, unless you use the
above-mentioned interaction pattern.

Herman

Setting component to run in realtime

On 08/02/2012 01:42 PM, Herman Bruyninckx wrote:
> On Thu, 2 Aug 2012, Johnathan Van Why wrote:
>
>> We have our component, called RT Ops, which runs a 1 kHz loop using a
>> Timer. When the stopHook() function of RT Ops is called, it should
>> kill the Timer for our cyclic loop. However, we have found that this
>> does not always guarantee that the loop stops.
> Never try to "kill" one activity from another one! Instead, send an
> event/message to the other activity asking it to shut itself down properly.
>
> You _could_ even agree on a protocol where the suicidal activity first sends
> an "ack" response to the activity that requests the suicide. This extra
> protocol comes in handy when your application requires several other
> activities to be stopped in a coordinated way.

Isn't this what the stopHook() is for though?

Say we have a "software e-stop" component that accepts messages from
systems that are monitoring safety-related sensor data. When this
software e-stop component gets a message, its job is to stop various
components that are causing the robot to move (trajectory followers,
I/O, etc). Why not just have the e-stop component call the stop()
method of the appropriate components? What's the benefit of adding a
"software e-stop message" port to all our e-stoppable components?

-dustin

Setting component to run in realtime

On 10/01/2012 05:44 PM, Gooding, Dustin R. (JSC-ER411) wrote:
> On 08/02/2012 01:42 PM, Herman Bruyninckx wrote:
>> On Thu, 2 Aug 2012, Johnathan Van Why wrote:
>>
>>> We have our component, called RT Ops, which runs a 1 kHz loop using a
>>> Timer. When the stopHook() function of RT Ops is called, it should
>>> kill the Timer for our cyclic loop. However, we have found that this
>>> does not always guarantee that the loop stops.
>> Never try to "kill" one activity from another one! Instead, send an
>> event/message to the other activity asking it to shut itself down properly.
>>
>> You _could_ even agree on a protocol where the suicidal activity first sends
>> an "ack" response to the activity that requests the suicide. This extra
>> protocol comes in handy when your application requires several other
>> activities to be stopped in a coordinated way.
>
> Isn't this what the stopHook() is for though?
>
> Say we have a "software e-stop" component that accepts messages from
> systems that are monitoring safety-related sensor data. When this
> software e-stop component gets a message, its job is to stop various
> components that are causing the robot to move (trajectory followers,
> I/O, etc). Why not just have the e-stop component call the stop()
> method of the appropriate components? What's the benefit of adding a
> "software e-stop message" port to all our e-stoppable components?

In my opinion, it isn't always safe to just kill some components; e.g.
you probably don't want your humanoid to collapse because it detected
an unwanted contact in its left arm. Killing the controller is
dangerous, but so is killing the trajectory generator, since you don't
know what will happen (what will the components reading from the
trajectory generator's ports do? Set the input to zero? Use the last
value? Crash?).
The desired reaction can be complex and depends on the application.
(The trajectory generator doesn't know what is done with its generated
set points, and depending on the controller that uses these values, you
want a different behavior.)
Therefore, I think sending an event is of interest, since it can deal
with these two problems (and is used e.g. in iTaSC).

nick

Setting component to run in realtime

On 10/01/2012 12:41 PM, Dominick Vanthienen wrote:

> On 10/01/2012 05:44 PM, Gooding, Dustin R. (JSC-ER411) wrote:
>> On 08/02/2012 01:42 PM, Herman Bruyninckx wrote:
>>> On Thu, 2 Aug 2012, Johnathan Van Why wrote:
>>>
>>>> We have our component, called RT Ops, which runs a 1 kHz loop using a
>>>> Timer. When the stopHook() function of RT Ops is called, it should
>>>> kill the Timer for our cyclic loop. However, we have found that this
>>>> does not always guarantee that the loop stops.
>>> Never try to "kill" one activity from another one! Instead, send an
>>> event/message to the other activity asking it to shut itself down properly.
>>>
>>> You _could_ even agree on a protocol where the suicidal activity first sends
>>> an "ack" response to the activity that requests the suicide. This extra
>>> protocol comes in handy when your application requires several other
>>> activities to be stopped in a coordinated way.
>>
>> Isn't this what the stopHook() is for though?
>>
>> Say we have a "software e-stop" component that accepts messages from
>> systems that are monitoring safety-related sensor data. When this
>> software e-stop component gets a message, its job is to stop various
>> components that are causing the robot to move (trajectory followers,
>> I/O, etc). Why not just have the e-stop component call the stop()
>> method of the appropriate components? What's the benefit of adding a
>> "software e-stop message" port to all our e-stoppable components?
>
> In my opinion, it isn't always safe to just kill some components; e.g.
> you probably don't want your humanoid to collapse because it detected
> an unwanted contact in its left arm. Killing the controller is
> dangerous, but so is killing the trajectory generator, since you don't
> know what will happen (what will the components reading from the
> trajectory generator's ports do? Set the input to zero? Use the last
> value? Crash?).
> The desired reaction can be complex and depends on the application.
> (The trajectory generator doesn't know what is done with its generated
> set points, and depending on the controller that uses these values, you
> want a different behavior.)
> Therefore, I think sending an event is of interest, since it can deal
> with these two problems (and is used e.g. in iTaSC).
>
> nick

As an ISS payload, R2 has a lot of very strict safety requirements, some of which aren't always the "best" thing to do, robotically speaking. Often, the required response to an unexpected event is to "ka-chunk" (killing motor power by opening relays, engaging motor brakes if available, disabling the sending of commands, etc). The ka-chunk method ensures that the robot stops moving very quickly, so as not to cause (further) damage to the ISS or astronauts, but still enables us to collect and respond to the event that caused it after things are safe. Whereas the e-stop method is simple and somewhat dumb, the clean-up and return-to-service activities are much more complex.

Now, your example of a humanoid is, agreed, a different application and the "ka-chunk" response isn't a good one.

What I gather you're saying, though, is that the message-based approach is more generic and could be used in multiple applications. In that case, I could easily understand a message-based approach as a default option, being replaced with direct stop() calls in certain very specific cases, if it makes sense.

-dustin

Setting component to run in realtime

On 10/02/2012 04:20 PM, Gooding, Dustin R. (JSC-ER411) wrote:
> On 10/01/2012 12:41 PM, Dominick Vanthienen wrote:
>> On 10/01/2012 05:44 PM, Gooding, Dustin R. (JSC-ER411) wrote:
>> > On 08/02/2012 01:42 PM, Herman Bruyninckx wrote:
>> >> On Thu, 2 Aug 2012, Johnathan Van Why wrote:
>> >>
>> >>> We have our component, called RT Ops, which runs a 1 kHz loop using a
>> >>> Timer. When the stopHook() function of RT Ops is called, it should
>> >>> kill the Timer for our cyclic loop. However, we have found that this
>> >>> does not always guarantee that the loop stops.
>> >> Never try to "kill" one activity from another one! Instead, send an
>> >> event/message to the other activity asking it to shut itself down
>> >> properly.
>> >>
>> >> You _could_ even agree on a protocol where the suicidal activity
>> >> first sends an "ack" response to the activity that requests the
>> >> suicide. This extra protocol comes in handy when your application
>> >> requires several other activities to be stopped in a coordinated way.
>> >
>> > Isn't this what the stopHook() is for though?
>> >
>> > Say we have a "software e-stop" component that accepts messages from
>> > systems that are monitoring safety-related sensor data. When this
>> > software e-stop component gets a message, its job is to stop various
>> > components that are causing the robot to move (trajectory followers,
>> > I/O, etc). Why not just have the e-stop component call the stop()
>> > method of the appropriate components? What's the benefit of adding a
>> > "software e-stop message" port to all our e-stoppable components?
>>
>> In my opinion, it isn't always safe to just kill some components; e.g.
>> you probably don't want your humanoid to collapse because it detected
>> an unwanted contact in its left arm. Killing the controller is
>> dangerous, but so is killing the trajectory generator, since you don't
>> know what will happen (what will the components reading from the
>> trajectory generator's ports do? Set the input to zero? Use the last
>> value? Crash?).
>> The desired reaction can be complex and depends on the application.
>> (The trajectory generator doesn't know what is done with its generated
>> set points, and depending on the controller that uses these values, you
>> want a different behavior.)
>> Therefore, I think sending an event is of interest, since it can deal
>> with these two problems (and is used e.g. in iTaSC).
>>
>> nick
>>
>
> As an ISS payload, R2 has a lot of very strict safety requirements, some
> of which aren't always the "best" thing to do, robotically speaking.
> Often, the required response to an unexpected event is to "ka-chunk"
> (killing motor power by opening relays, engaging motor brakes if
> available, disabling the sending of commands, etc). The ka-chunk method
> ensures that the robot stops moving very quickly, so as not to cause
> (further) damage to the ISS or astronauts, but still enables us to
> collect and respond to the event that caused it after things are safe.
> Whereas the e-stop method is simple and somewhat dumb, the clean-up and
> return-to-service activities are much more complex.
>
> Now, your example of a humanoid is, agreed, a different application and
> the "ka-chunk" response isn't a good one.
>
> What I gather you're saying, though, is that the message-based
> approach is more generic and could be used in multiple applications. In
> that case, I could easily understand a message-based approach as a
> default option, being replaced with direct stop() calls in certain
> very specific cases, if it makes sense.
>
indeed :)

Setting component to run in realtime

> Never try to "kill" one activity from another one! Instead, send an
> event/message to the other activity asking it to shut itself down properly.

What's the issue with this? I thought the purpose of the timer_id
parameter was so the RTT::os::Timer could identify what Timer to shut
down?

> You _could_ even agree on a protocol where the suicidal activity first sends
> an "ack" response to the activity that requests the suicide. This extra
> protocol comes in handy when your application requires several other
> activities to be stopped in a coordinated way.

We have no need to coordinate our shutdown, other than that the cyclic
thread must not execute (at all) after we call killTimer().

> My advice above holds even more, to the nth power, for concurrently
> running activities!

These shouldn't be concurrent...

Everything here is in one single-threaded component.

Johnathan Van Why
Dynamic Robotics Laboratory
Oregon State University

Setting component to run in realtime

On Thu, 2 Aug 2012, Johnathan Van Why wrote:

>> Never try to "kill" one activity from another one! Instead, send an
>> event/message to the other activity asking it to shut itself down properly.
>
> What's the issue with this? I thought the purpose of the timer_id
> parameter was so the RTT::os::Timer could identify what Timer to shut
> down?
>
>> You _could_ even agree on a protocol where the suicidal activity first sends
>> an "ack" response to the activity that requests the suicide. This extra
>> protocol comes in handy when your application requires several other
>> activities to be stopped in a coordinated way.
>
> We have no need to coordinate our shutdown, other than that the cyclic
> thread must not execute (at all) after we call killTimer().
>
>> My advice above holds even more, to the nth power, for concurrently
>> running activities!
>
> These shouldn't be concurrent...

(You should keep the context of the previous messages to which people
reply; the way you do it now makes life very difficult...)

I recall that your previous email was talking about concurrency issues, but
now you seem to claim the opposite. I am confused. (I can also be wrong, of
course.)

> Everything here is in one single-threaded component.

Are you sure?

> Johnathan Van Why
> Dynamic Robotics Laboratory
> Oregon State University

Herman

Setting component to run in realtime

On Thu, Aug 2, 2012 at 12:05 PM, Herman Bruyninckx
<Herman [dot] Bruyninckx [..] ...> wrote:
> On Thu, 2 Aug 2012, Johnathan Van Why wrote:
>
>>> Never try to "kill" one activity from another one! Instead, send an
>>> event/message to the other activity asking it to shut itself down
>>> properly.
>>
>>
>> What's the issue with this? I thought the purpose of the timer_id
>> parameter was so the RTT::os::Timer could identify what Timer to shut
>> down?
>>
>>> You _could_ even agree on a protocol where the suicidal activity first
>>> sends
>>> an "ack" response to the activity that requests the suicide. This extra
>>> protocol comes in handy when your application requires several other
>>> activities to be stopped in a coordinated way.
>>
>>
>> We have no need to coordinate our shutdown, other than that the cyclic
>> thread must not execute (at all) after we call killTimer().
>>
>>> My advice above holds even more, to the nth power, for concurrently
>>> running activities!
>>
>>
>> These shouldn't be concurrent...
>
>
> (You should keep the context of the previous messages to which people
> reply; the way you do it now makes life very difficult...)
>
> I recall that your previous email was talking about concurrency issues, but
> now you seem to claim the opposite. I am confused. (I can also be wrong, of
> course.)
>
>> Everything here is in one single-threaded component.
> Are you sure?

We are hoping to run everything in this component in one realtime
thread. That does not seem to be happening (our timeout() and
stopHook() are running at the same time).

How do we run everything in one thread? I'd rather not have to make
this entire component threadsafe -- I've spent several hours thinking
about thread safety here and it's not particularly easy.

Thank You,
Johnathan Van Why
Dynamic Robotics Laboratory
Oregon State University