Multiple port readers

We are wondering: what is the precise sequence of events in RTT v1 when one component writes to a data port and more than one other component reads from the same port/connection?

We have a coordinating component (Master activity) with three sub-components (Slave activities) that it calls in order on a given cycle: A, B, and then C. The coordinating component has the highest importance/priority in our system (real time Linux). A less important component J produces an output (the writer for the data port) that A and C read (the two readers of the data port). Trouble is, A and C occasionally get different values on a given cycle (we can see this in our logs), suggesting that J is running in between, or that its data differs somehow between when A and C read it. This is a real artifact, as it turns into visible vibration in hardware ... :-(

We don't think this has anything to do with A or B blocking and thereby allowing J to run in between A and C. We also don't think that priority inversion is involved.

Which brings me back to the question - how exactly is data moved from the writer of a data port to more than one reader of said port? The data in question is a pre-allocated std::vector<double>, in case it has any bearing.

Thanks
S

Multiple port readers

On Wed, May 18, 2011 at 2:11 AM, S Roderick <kiwi [dot] net [..] ...> wrote:
> We are wondering what is the precise sequence of events in RTT v1, when one component writes to a data port, and more the one other components read from the same port/connection?
>
> We have a coordinating component (Master activity) with three sub-components (Slave activities) that it calls in order on a given cycle: A, B, and then C. The coordinating component has the highest importance/priority in our system (real time Linux). A less important component J produces an output (the writer for the data port) that A and C read (the two readers of the data port). Trouble is, A and C occasionally get different values on a given cycle (we can see this in our logs), suggesting that J is running in between, or that its data differs somehow between when A and C read it. This is a real artifact, as it turns into visible vibration in hardware ... :-(
>
> We don't think this has anything to do with A or B blocking, therefore allowing J to run in between A and C. We also don't think that priority inversion is involved.
>
> Which brings me back to the question - how exactly is data moved from the writer of a data port to more than one readers of said port? The data in question is a pre-allocated std::vector<double>, in case it has any bearing.

Connections in 1.x are fairly simple. You can visualise a connection as
a central container that holds the data; any read/write port has a
pointer to that container and can read or write it. The container
takes care of the thread-safe read/write of data. So data isn't moved
around that much. I think J is indeed running in between and that it
causes this difference. To be sure that a high-prio component is not
interrupted, we run it in a Xenomai primary domain thread and look
for mode switches. Which target are you using, btw?
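
A minimal sketch of that picture, assuming a single shared container and
two reads per cycle. All names here (SharedData, cycleIsConsistent) are
hypothetical, and a mutex stands in for RTT's lock-free data object purely
to keep the example short. It only illustrates why A and C can disagree
when J writes between their reads:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the shared container behind a 1.x data
// connection. This is NOT the RTT implementation: RTT uses a lock-free
// data object, a mutex is used here only for brevity.
class SharedData {
public:
    explicit SharedData(std::size_t n) : value_(n, 0.0) {}

    // Writer side (component J): overwrite the single stored sample.
    void set(const std::vector<double>& v) {
        std::lock_guard<std::mutex> lock(m_);
        assert(v.size() == value_.size());  // pre-allocated, fixed size
        value_ = v;
    }

    // Reader side (components A and C): copy out the current sample.
    void get(std::vector<double>& out) const {
        std::lock_guard<std::mutex> lock(m_);
        out = value_;
    }

private:
    mutable std::mutex m_;
    std::vector<double> value_;
};

// Simulates one Master cycle. If jRunsInBetween is true, J writes a new
// sample after A has read and before C reads.
// Returns true if A and C saw the same sample.
bool cycleIsConsistent(SharedData& data, bool jRunsInBetween) {
    std::vector<double> a(3), c(3);
    data.get(a);                          // A reads
    if (jRunsInBetween) {
        data.set({1.0, 2.0, 3.0});        // J preempts and writes
    }
    data.get(c);                          // C reads
    return a == c;
}
```

Each individual read is consistent, but nothing ties the two reads of one
cycle to the same sample.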

There was a bug in DataObjectLockFree where one of the lock-free regions
was not properly initialized, which caused a single mode switch (no
data corruption). That patch is on trunk as well. But if you have it
frequently, I would suggest that the priority inversion comes from
somewhere else, and that in fact, Master is not running as well as you
think.

Peter

Multiple port readers

On Wed, 18 May 2011, Peter Soetens wrote:

> On Wed, May 18, 2011 at 2:11 AM, S Roderick <kiwi [dot] net [..] ...> wrote:
>> We are wondering what is the precise sequence of events in RTT v1, when one component writes to a data port, and more the one other components read from the same port/connection?
>>
>> We have a coordinating component (Master activity) with three sub-components (Slave activities) that it calls in order on a given cycle: A, B, and then C. The coordinating component has the highest importance/priority in our system (real time Linux). A less important component J produces an output (the writer for the data port) that A and C read (the two readers of the data port). Trouble is, A and C occasionally get different values on a given cycle (we can see this in our logs), suggesting that J is running in between, or that its data differs somehow between when A and C read it. This is a real artifact, as it turns into visible vibration in hardware ... :-(
>>
>> We don't think this has anything to do with A or B blocking, therefore allowing J to run in between A and C. We also don't think that priority inversion is involved.
>>
>> Which brings me back to the question - how exactly is data moved from the writer of a data port to more than one readers of said port? The data in question is a pre-allocated std::vector<double>, in case it has any bearing.
>
> Connections in 1.x are fairly simple. You can visualise it as a
> central container that holds the data, and any read/write port has a
> pointer to that container and can read or write it. The container
> takes care of the thread-safe read/write of data. So data isn't moved
> around that mutch. I think J is indeed running in between and that it
> causes this difference. What we do to be sure that a high-prio
> component is not interrupted, we run it in a Xenomai primary domain
> thread and look for mode switches. Which target are you using btw ?
>
> There was a bug in DataObjectLockFree that one of lock-free regions
> was not properly initialized and which caused a single mode switch (no
> data corruption). That patch is on trunk as well. But if you have it
> frequently, I would suggest that the priority inversion comes from
> somewhere else, and that in fact, Master is not running as good as you
> think.

Which brings me to my eternal mantra: "Never use priorities to guarantee
Coordination": it's the _logic_ in your application that is responsible for
guaranteed Coordination. (In case you need such guarantees.) Priorities can
help to improve the _performance_ of the Coordination, but can/should never
be expected to _realise_ it.

> Peter

Herman

Multiple port readers

On May 18, 2011, at 04:53 , Herman Bruyninckx wrote:

> On Wed, 18 May 2011, Peter Soetens wrote:
>
>> On Wed, May 18, 2011 at 2:11 AM, S Roderick <kiwi [dot] net [..] ...> wrote:
>>> We are wondering what is the precise sequence of events in RTT v1, when one component writes to a data port, and more the one other components read from the same port/connection?
>>>
>>> We have a coordinating component (Master activity) with three sub-components (Slave activities) that it calls in order on a given cycle: A, B, and then C. The coordinating component has the highest importance/priority in our system (real time Linux). A less important component J produces an output (the writer for the data port) that A and C read (the two readers of the data port). Trouble is, A and C occasionally get different values on a given cycle (we can see this in our logs), suggesting that J is running in between, or that its data differs somehow between when A and C read it. This is a real artifact, as it turns into visible vibration in hardware ... :-(
>>>
>>> We don't think this has anything to do with A or B blocking, therefore allowing J to run in between A and C. We also don't think that priority inversion is involved.
>>>
>>> Which brings me back to the question - how exactly is data moved from the writer of a data port to more than one readers of said port? The data in question is a pre-allocated std::vector<double>, in case it has any bearing.
>>
>> Connections in 1.x are fairly simple. You can visualise it as a
>> central container that holds the data, and any read/write port has a
>> pointer to that container and can read or write it. The container
>> takes care of the thread-safe read/write of data. So data isn't moved
>> around that mutch. I think J is indeed running in between and that it
>> causes this difference. What we do to be sure that a high-prio
>> component is not interrupted, we run it in a Xenomai primary domain
>> thread and look for mode switches. Which target are you using btw ?

That's what I thought, but wanted to double check. This is PREEMPT_RT Linux, not Xenomai, on x86_64 stock Dell hardware.

I agree - I think J is running in between, but I'm having trouble determining _how_ that can be happening. We'll keep digging ...

As a matter of interest, can anything interesting happen (e.g. blocking, yielding) when a component with a Master activity sequences calls to a series of Slave update() functions in a state machine? Is anything unusual happening within the slave activity implementation?

>> There was a bug in DataObjectLockFree that one of lock-free regions
>> was not properly initialized and which caused a single mode switch (no
>> data corruption). That patch is on trunk as well. But if you have it
>> frequently, I would suggest that the priority inversion comes from
>> somewhere else, and that in fact, Master is not running as good as you
>> think.
>
> Which brings me to me eternal mantra: "Never use priorities to guarantee
> Coordination": it's the _logic_ in your application that is responsible for
> guaranteed Coordination. (In case you need such garanties.) Priorities can
> help to improve the _performance_ of the Coordination, but can/should never
> be expected to _realise_ it.

So can you outline how you would solve the above with logic in RTT v1 then?
S

Multiple port readers

On Wednesday 18 May 2011 13:14:51 Stephen Roderick wrote:
> On May 18, 2011, at 04:53 , Herman Bruyninckx wrote:
> > On Wed, 18 May 2011, Peter Soetens wrote:
> >> On Wed, May 18, 2011 at 2:11 AM, S Roderick <kiwi [dot] net [..] ...> wrote:
> >>> We are wondering what is the precise sequence of events in RTT v1, when
> >>> one component writes to a data port, and more the one other components
> >>> read from the same port/connection?
> >>>
> >>> We have a coordinating component (Master activity) with three
> >>> sub-components (Slave activities) that it calls in order on a given
> >>> cycle: A, B, and then C. The coordinating component has the highest
> >>> importance/priority in our system (real time Linux). A less important
> >>> component J produces an output (the writer for the data port) that A
> >>> and C read (the two readers of the data port). Trouble is, A and C
> >>> occasionally get different values on a given cycle (we can see this in
> >>> our logs), suggesting that J is running in between, or that its data
> >>> differs somehow between when A and C read it. This is a real artifact,
> >>> as it turns into visible vibration in hardware ... :-(
> >>>
> >>> We don't think this has anything to do with A or B blocking, therefore
> >>> allowing J to run in between A and C. We also don't think that
> >>> priority inversion is involved.
> >>>
> >>> Which brings me back to the question - how exactly is data moved from
> >>> the writer of a data port to more than one readers of said port? The
> >>> data in question is a pre-allocated std::vector<double>, in case it
> >>> has any bearing.
> >>
> >> Connections in 1.x are fairly simple. You can visualise it as a
> >> central container that holds the data, and any read/write port has a
> >> pointer to that container and can read or write it. The container
> >> takes care of the thread-safe read/write of data. So data isn't moved
> >> around that mutch. I think J is indeed running in between and that it
> >> causes this difference. What we do to be sure that a high-prio
> >> component is not interrupted, we run it in a Xenomai primary domain
> >> thread and look for mode switches. Which target are you using btw ?
>
> That's what I thought, but wanted to double check. This is PREEMPT_RT
> Linux, not Xenomai, on x86_64 stock Dell hardware.
>
> I agree - I think J is running in between, but I'm having trouble determing
> _how_ that can be happening. We'll keep digging ...
>
> As a matter of interest, can anything interesting happen (eg blocking,
> yielding), when a component with a Master activity sequences calls to a
> series of Slave update() functions in a state machine? Is anything unusual
> happening within the slave activity implementation?

No. It's trivial C++ code. You should check that no Logger::In and Logger::log
statements are used in your components. I still think that user code is
causing it... Are page faults/malloc() leading to context switches ? I thought
not, but if it is the case, then any malloc() is suspect...

>
> >> There was a bug in DataObjectLockFree that one of lock-free regions
> >> was not properly initialized and which caused a single mode switch (no
> >> data corruption). That patch is on trunk as well. But if you have it
> >> frequently, I would suggest that the priority inversion comes from
> >> somewhere else, and that in fact, Master is not running as good as you
> >> think.
> >
> > Which brings me to me eternal mantra: "Never use priorities to guarantee
> > Coordination": it's the _logic_ in your application that is responsible
> > for guaranteed Coordination. (In case you need such garanties.)
> > Priorities can help to improve the _performance_ of the Coordination,
> > but can/should never be expected to _realise_ it.
>
> So can you outline how you would solve the above with logic in RTT v1 then?

As Herman puts it, only a very strict architecture can guarantee what you
need. You'll need to create one additional component 'X' that talks to J only
and which provides the data of J to A and B. You let X run with a slave
activity too, driven by the same master as A and B. Then you'll always have
the right copy.
Not that much work I would think...and guaranteed to work.
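
A rough sketch of the idea, with hypothetical names (SnapshotX,
snapshotCycleIsConsistent) and a plain vector standing in for J's port
data; the thread-safe read from J's connection is left out. X copies J's
data once at the start of the cycle, and the other slaves read only that
copy:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical component X: holds one per-cycle copy of J's output.
struct SnapshotX {
    std::vector<double> sample;

    explicit SnapshotX(std::size_t n) : sample(n, 0.0) {}

    // Called once per cycle by the Master, before the other slaves run.
    void update(const std::vector<double>& fromJ) {
        assert(fromJ.size() == sample.size());  // pre-allocated, fixed size
        sample = fromJ;
    }
};

// One Master cycle: X first, then the readers. J writes in the middle of
// the cycle, which changes jOutput but not the snapshot held by X.
// Returns true if both readers saw the same sample.
bool snapshotCycleIsConsistent() {
    std::vector<double> jOutput{0.0, 0.0, 0.0};  // stands in for J's port data
    SnapshotX x(jOutput.size());

    x.update(jOutput);                        // X copies J's data once
    std::vector<double> first = x.sample;     // first reader uses the snapshot
    jOutput = {1.0, 2.0, 3.0};                // J preempts and writes new data
    std::vector<double> second = x.sample;    // second reader, same snapshot
    return first == second;
}
```

The readers stay consistent with each other within a cycle; this does not
by itself fix the order in which J runs relative to the Master.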

Peter

Multiple port readers

On May 20, 2011, at 05:09 , Peter Soetens wrote:

> On Wednesday 18 May 2011 13:14:51 Stephen Roderick wrote:
>> On May 18, 2011, at 04:53 , Herman Bruyninckx wrote:
>>> On Wed, 18 May 2011, Peter Soetens wrote:
>>>> On Wed, May 18, 2011 at 2:11 AM, S Roderick <kiwi [dot] net [..] ...> wrote:
>>>>> We are wondering what is the precise sequence of events in RTT v1, when
>>>>> one component writes to a data port, and more the one other components
>>>>> read from the same port/connection?
>>>>>
>>>>> We have a coordinating component (Master activity) with three
>>>>> sub-components (Slave activities) that it calls in order on a given
>>>>> cycle: A, B, and then C. The coordinating component has the highest
>>>>> importance/priority in our system (real time Linux). A less important
>>>>> component J produces an output (the writer for the data port) that A
>>>>> and C read (the two readers of the data port). Trouble is, A and C
>>>>> occasionally get different values on a given cycle (we can see this in
>>>>> our logs), suggesting that J is running in between, or that its data
>>>>> differs somehow between when A and C read it. This is a real artifact,
>>>>> as it turns into visible vibration in hardware ... :-(
>>>>>
>>>>> We don't think this has anything to do with A or B blocking, therefore
>>>>> allowing J to run in between A and C. We also don't think that
>>>>> priority inversion is involved.
>>>>>
>>>>> Which brings me back to the question - how exactly is data moved from
>>>>> the writer of a data port to more than one readers of said port? The
>>>>> data in question is a pre-allocated std::vector<double>, in case it
>>>>> has any bearing.
>>>>
>>>> Connections in 1.x are fairly simple. You can visualise it as a
>>>> central container that holds the data, and any read/write port has a
>>>> pointer to that container and can read or write it. The container
>>>> takes care of the thread-safe read/write of data. So data isn't moved
>>>> around that mutch. I think J is indeed running in between and that it
>>>> causes this difference. What we do to be sure that a high-prio
>>>> component is not interrupted, we run it in a Xenomai primary domain
>>>> thread and look for mode switches. Which target are you using btw ?
>>
>> That's what I thought, but wanted to double check. This is PREEMPT_RT
>> Linux, not Xenomai, on x86_64 stock Dell hardware.
>>
>> I agree - I think J is running in between, but I'm having trouble determing
>> _how_ that can be happening. We'll keep digging ...
>>
>> As a matter of interest, can anything interesting happen (eg blocking,
>> yielding), when a component with a Master activity sequences calls to a
>> series of Slave update() functions in a state machine? Is anything unusual
>> happening within the slave activity implementation?
>
> No. It's trivial C++ code. You should check that no Logger::In and Logger::log
> statements are used in your components. I still think that user code is
> causing it... Are page faults/malloc() leading to context switches ? I thought
> not, but if it is the case, then any malloc() is suspect...

I thought so.

And no ... we ripped out all Logger::xxx related stuff years ago.

And we have diagnostics throughout our code tracking page faults. Been caught by that before ... :-)

It is certainly worth double-checking the above two things though, I will go and do that.

>>>> There was a bug in DataObjectLockFree that one of lock-free regions
>>>> was not properly initialized and which caused a single mode switch (no
>>>> data corruption). That patch is on trunk as well. But if you have it
>>>> frequently, I would suggest that the priority inversion comes from
>>>> somewhere else, and that in fact, Master is not running as good as you
>>>> think.
>>>
>>> Which brings me to me eternal mantra: "Never use priorities to guarantee
>>> Coordination": it's the _logic_ in your application that is responsible
>>> for guaranteed Coordination. (In case you need such garanties.)
>>> Priorities can help to improve the _performance_ of the Coordination,
>>> but can/should never be expected to _realise_ it.
>>
>> So can you outline how you would solve the above with logic in RTT v1 then?
>
> As Herman puts it, only a very strict architecture can guarantee what you
> need. You'll need to create one additional component 'X' that talks to J only
> and which provides the data of J to A and B. You let X run with a slave
> activity too by the same master of A and B. Then you'll always have the right
> copy.
> Not that much work I would think...and guaranteed to work.

Hmmm ... so X is a peer of A and B under their coordinator. I don't see how this really helps though - it only means that A and B would get consistent data, right? It wouldn't solve the fundamental problem of needing the sequence to always be J, A, B, J, A, B, ... or am I missing something?

I agree that our use of priorities was a bit of a shortcut to try and guarantee things - funny thing is, we've gotten away with it for years ... sigh ... the nature of realtime systems ...
S

Multiple port readers

On Fri, 20 May 2011, Stephen Roderick wrote:

> On May 20, 2011, at 05:09 , Peter Soetens wrote:
>
>> On Wednesday 18 May 2011 13:14:51 Stephen Roderick wrote:
>>> On May 18, 2011, at 04:53 , Herman Bruyninckx wrote:
>>>> On Wed, 18 May 2011, Peter Soetens wrote:
>>>>> On Wed, May 18, 2011 at 2:11 AM, S Roderick <kiwi [dot] net [..] ...> wrote:
>>>>>> We are wondering what is the precise sequence of events in RTT v1, when
>>>>>> one component writes to a data port, and more the one other components
>>>>>> read from the same port/connection?
>>>>>>
>>>>>> We have a coordinating component (Master activity) with three
>>>>>> sub-components (Slave activities) that it calls in order on a given
>>>>>> cycle: A, B, and then C. The coordinating component has the highest
>>>>>> importance/priority in our system (real time Linux). A less important
>>>>>> component J produces an output (the writer for the data port) that A
>>>>>> and C read (the two readers of the data port). Trouble is, A and C
>>>>>> occasionally get different values on a given cycle (we can see this in
>>>>>> our logs), suggesting that J is running in between, or that its data
>>>>>> differs somehow between when A and C read it. This is a real artifact,
>>>>>> as it turns into visible vibration in hardware ... :-(
>>>>>>
>>>>>> We don't think this has anything to do with A or B blocking, therefore
>>>>>> allowing J to run in between A and C. We also don't think that
>>>>>> priority inversion is involved.
>>>>>>
>>>>>> Which brings me back to the question - how exactly is data moved from
>>>>>> the writer of a data port to more than one readers of said port? The
>>>>>> data in question is a pre-allocated std::vector<double>, in case it
>>>>>> has any bearing.
>>>>>
>>>>> Connections in 1.x are fairly simple. You can visualise it as a
>>>>> central container that holds the data, and any read/write port has a
>>>>> pointer to that container and can read or write it. The container
>>>>> takes care of the thread-safe read/write of data. So data isn't moved
>>>>> around that mutch. I think J is indeed running in between and that it
>>>>> causes this difference. What we do to be sure that a high-prio
>>>>> component is not interrupted, we run it in a Xenomai primary domain
>>>>> thread and look for mode switches. Which target are you using btw ?
>>>
>>> That's what I thought, but wanted to double check. This is PREEMPT_RT
>>> Linux, not Xenomai, on x86_64 stock Dell hardware.
>>>
>>> I agree - I think J is running in between, but I'm having trouble determing
>>> _how_ that can be happening. We'll keep digging ...
>>>
>>> As a matter of interest, can anything interesting happen (eg blocking,
>>> yielding), when a component with a Master activity sequences calls to a
>>> series of Slave update() functions in a state machine? Is anything unusual
>>> happening within the slave activity implementation?
>>
>> No. It's trivial C++ code. You should check that no Logger::In and Logger::log
>> statements are used in your components. I still think that user code is
>> causing it... Are page faults/malloc() leading to context switches ? I thought
>> not, but if it is the case, then any malloc() is suspect...
>
> I thought so.
>
> And no ... we ripped out all Logger::xxx related stuff years ago.
>
> And we have diagnostics throughout our code tracking page faults. Been caught by that before ... :-)
>
> It is certainly worth double-checking the above two things though, I will go and do that.
>
>>>>> There was a bug in DataObjectLockFree that one of lock-free regions
>>>>> was not properly initialized and which caused a single mode switch (no
>>>>> data corruption). That patch is on trunk as well. But if you have it
>>>>> frequently, I would suggest that the priority inversion comes from
>>>>> somewhere else, and that in fact, Master is not running as good as you
>>>>> think.
>>>>
>>>> Which brings me to me eternal mantra: "Never use priorities to guarantee
>>>> Coordination": it's the _logic_ in your application that is responsible
>>>> for guaranteed Coordination. (In case you need such garanties.)
>>>> Priorities can help to improve the _performance_ of the Coordination,
>>>> but can/should never be expected to _realise_ it.
>>>
>>> So can you outline how you would solve the above with logic in RTT v1 then?
>>
>> As Herman puts it, only a very strict architecture can guarantee what you
>> need. You'll need to create one additional component 'X' that talks to J only
>> and which provides the data of J to A and B. You let X run with a slave
>> activity too by the same master of A and B. Then you'll always have the right
>> copy.
>> Not that much work I would think...and guaranteed to work.
>
> Hmmm ... so X is a peer of A and B under their coordinator. I don't see how this really helps though - it only means that A and B would get consistent data, right? It wouldn't solve the fundamental problem of needing the sequence to always be J, A, B, J, A, B, ... or am I missing something?
>
> I agree that our use of priorities was a bit of a shortcut to try and guarantee things - funny thing is, we've gotten away with it for years ... sigh ... the nature of realtime systems ...

So, I hope you have learned your lesson! The same one I also learned
_after_ having been bitten (severely, and several times) by this
"priorities will do the job, for now" attitude to systems design :-) Never
again!

> S

Herman

Multiple port readers

On Friday 20 May 2011 14:24:56 Stephen Roderick wrote:
> On May 20, 2011, at 05:09 , Peter Soetens wrote:
> > On Wednesday 18 May 2011 13:14:51 Stephen Roderick wrote:
> >> On May 18, 2011, at 04:53 , Herman Bruyninckx wrote:
> >>> On Wed, 18 May 2011, Peter Soetens wrote:
> >>>> On Wed, May 18, 2011 at 2:11 AM, S Roderick <kiwi [dot] net [..] ...> wrote:
> >>>>> We are wondering what is the precise sequence of events in RTT v1,
> >>>>> when one component writes to a data port, and more the one other
> >>>>> components read from the same port/connection?
> >>>>>
> >>>>> We have a coordinating component (Master activity) with three
> >>>>> sub-components (Slave activities) that it calls in order on a given
> >>>>> cycle: A, B, and then C. The coordinating component has the highest
> >>>>> importance/priority in our system (real time Linux). A less important
> >>>>> component J produces an output (the writer for the data port) that A
> >>>>> and C read (the two readers of the data port). Trouble is, A and C
> >>>>> occasionally get different values on a given cycle (we can see this
> >>>>> in our logs), suggesting that J is running in between, or that its
> >>>>> data differs somehow between when A and C read it. This is a real
> >>>>> artifact, as it turns into visible vibration in hardware ... :-(
> >>>>>
> >>>>> We don't think this has anything to do with A or B blocking,
> >>>>> therefore allowing J to run in between A and C. We also don't think
> >>>>> that priority inversion is involved.
> >>>>>
> >>>>> Which brings me back to the question - how exactly is data moved from
> >>>>> the writer of a data port to more than one readers of said port? The
> >>>>> data in question is a pre-allocated std::vector<double>, in case it
> >>>>> has any bearing.
> >>>>
> >>>> Connections in 1.x are fairly simple. You can visualise it as a
> >>>> central container that holds the data, and any read/write port has a
> >>>> pointer to that container and can read or write it. The container
> >>>> takes care of the thread-safe read/write of data. So data isn't moved
> >>>> around that mutch. I think J is indeed running in between and that it
> >>>> causes this difference. What we do to be sure that a high-prio
> >>>> component is not interrupted, we run it in a Xenomai primary domain
> >>>> thread and look for mode switches. Which target are you using btw ?
> >>
> >> That's what I thought, but wanted to double check. This is PREEMPT_RT
> >> Linux, not Xenomai, on x86_64 stock Dell hardware.
> >>
> >> I agree - I think J is running in between, but I'm having trouble
> >> determing _how_ that can be happening. We'll keep digging ...
> >>
> >> As a matter of interest, can anything interesting happen (eg blocking,
> >> yielding), when a component with a Master activity sequences calls to a
> >> series of Slave update() functions in a state machine? Is anything
> >> unusual happening within the slave activity implementation?
> >
> > No. It's trivial C++ code. You should check that no Logger::In and
> > Logger::log statements are used in your components. I still think that
> > user code is causing it... Are page faults/malloc() leading to context
> > switches ? I thought not, but if it is the case, then any malloc() is
> > suspect...
>
> I thought so.
>
> And no ... we ripped out all Logger::xxx related stuff years ago.
>
> And we have diagnostics throughout our code tracking page faults. Been
> caught by that before ... :-)
>
> It is certainly worth double-checking the above two things though, I will
> go and do that.
>
> >>>> There was a bug in DataObjectLockFree that one of lock-free regions
> >>>> was not properly initialized and which caused a single mode switch (no
> >>>> data corruption). That patch is on trunk as well. But if you have it
> >>>> frequently, I would suggest that the priority inversion comes from
> >>>> somewhere else, and that in fact, Master is not running as good as you
> >>>> think.
> >>>
> >>> Which brings me to me eternal mantra: "Never use priorities to
> >>> guarantee Coordination": it's the _logic_ in your application that is
> >>> responsible for guaranteed Coordination. (In case you need such
> >>> garanties.) Priorities can help to improve the _performance_ of the
> >>> Coordination, but can/should never be expected to _realise_ it.
> >>
> >> So can you outline how you would solve the above with logic in RTT v1
> >> then?
> >
> > As Herman puts it, only a very strict architecture can guarantee what you
> > need. You'll need to create one additional component 'X' that talks to J
> > only and which provides the data of J to A and B. You let X run with a
> > slave activity too by the same master of A and B. Then you'll always
> > have the right copy.
> > Not that much work I would think...and guaranteed to work.
>
> Hmmm ... so X is a peer of A and B under their coordinator. I don't see how
> this really helps though - it only means that A and B would get consistent
> data, right?

Yes, I thought that was the purpose in the first place.

> It wouldn't solve the fundamental problem of needing the
> sequence to always be J, A, B, J, A, B, ... or am I missing something?

How can you guarantee that sequence if J is running in a different thread ?
Why is it then running in a different thread?

>
> I agree that our use of priorities was a bit of a shortcut to try and
> guarantee things - funny thing is, we've gotten away with it for years
> ... sigh ... the nature of realtime systems ... S

Peter

Multiple port readers

On Wed, 18 May 2011, Stephen Roderick wrote:

> On May 18, 2011, at 04:53 , Herman Bruyninckx wrote:
>
>> On Wed, 18 May 2011, Peter Soetens wrote:
>>
>>> On Wed, May 18, 2011 at 2:11 AM, S Roderick <kiwi [dot] net [..] ...> wrote:
>>>> We are wondering what is the precise sequence of events in RTT v1, when one component writes to a data port, and more the one other components read from the same port/connection?
>>>>
>>>> We have a coordinating component (Master activity) with three sub-components (Slave activities) that it calls in order on a given cycle: A, B, and then C. The coordinating component has the highest importance/priority in our system (real time Linux). A less important component J produces an output (the writer for the data port) that A and C read (the two readers of the data port). Trouble is, A and C occasionally get different values on a given cycle (we can see this in our logs), suggesting that J is running in between, or that its data differs somehow between when A and C read it. This is a real artifact, as it turns into visible vibration in hardware ... :-(
>>>>
>>>> We don't think this has anything to do with A or B blocking, therefore allowing J to run in between A and C. We also don't think that priority inversion is involved.
>>>>
>>>> Which brings me back to the question - how exactly is data moved from the writer of a data port to more than one readers of said port? The data in question is a pre-allocated std::vector<double>, in case it has any bearing.
>>>
>>> Connections in 1.x are fairly simple. You can visualise it as a
>>> central container that holds the data, and any read/write port has a
>>> pointer to that container and can read or write it. The container
>>> takes care of the thread-safe read/write of data. So data isn't moved
>>> around that mutch. I think J is indeed running in between and that it
>>> causes this difference. What we do to be sure that a high-prio
>>> component is not interrupted, we run it in a Xenomai primary domain
>>> thread and look for mode switches. Which target are you using btw ?
>
> That's what I thought, but wanted to double check. This is PREEMPT_RT Linux, not Xenomai, on x86_64 stock Dell hardware.
>
> I agree - I think J is running in between, but I'm having trouble determing _how_ that can be happening. We'll keep digging ...
>
> As a matter of interest, can anything interesting happen (eg blocking,
> yielding), when a component with a Master activity sequences calls to a
> series of Slave update() functions in a state machine? Is anything
> unusual happening within the slave activity implementation?

One thing I can think of (don't know whether it is relevant to your
particular case...): when a state machine becomes active, it could be in a
situation where events or "world states" have occurred that allow it to
transition through several of its states in one go; there is a _policy_
that can put a maximum on that number of transitions. That means that a
state machine could put itself to sleep even though (i) it could still do
some things, and (ii) its thread has a higher priority than that of another
component.
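
A small sketch of such a policy, under the assumption that every
transition is always enabled until a final state is reached. Machine,
activate and maxTransitions are hypothetical names, not RTT API:

```cpp
#include <cassert>

// Hypothetical state machine: states 0..finalState, and a transition to
// the next state is always enabled until finalState is reached.
struct Machine {
    int state = 0;
    int finalState = 5;

    bool canStep() const { return state < finalState; }
    void step() { ++state; }
};

// One activation of the machine. At most maxTransitions transitions are
// taken, even if more are enabled; the machine then yields.
// Returns the number of transitions actually taken.
int activate(Machine& m, int maxTransitions) {
    assert(maxTransitions >= 0);
    int taken = 0;
    while (m.canStep() && taken < maxTransitions) {
        m.step();
        ++taken;
    }
    return taken;
}
```

With a cap of 2, the first activation stops in state 2 although three more
transitions are enabled, so a lower-priority thread may run before the
machine continues.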

>>> There was a bug in DataObjectLockFree that one of lock-free regions
>>> was not properly initialized and which caused a single mode switch (no
>>> data corruption). That patch is on trunk as well. But if you have it
>>> frequently, I would suggest that the priority inversion comes from
>>> somewhere else, and that in fact, Master is not running as good as you
>>> think.
>>
>> Which brings me to me eternal mantra: "Never use priorities to guarantee
>> Coordination": it's the _logic_ in your application that is responsible for
>> guaranteed Coordination. (In case you need such garanties.) Priorities can
>> help to improve the _performance_ of the Coordination, but can/should never
>> be expected to _realise_ it.
>
> So can you outline how you would solve the above with logic in RTT v1 then?

I would need a more complete view of your application. But, if you
are really keen on having things happen in a particular order, there is
just one choice: collocate them in the same process, and use a data
structure that does the right scheduling for you. If you don't follow this
(rather restrictive) approach and do want to distribute your activities
over multiple components, there will always be 'race conditions' possible
and thinkable; at least, that's my experience...
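
As a sketch only (Schedule, Step and traceTwoCycles are hypothetical
names, and the steps just record their own name), such a data structure
can be as simple as an ordered list of steps executed by one thread:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

using Step = std::function<void()>;

// Hypothetical schedule: an ordered list of steps, all executed in the
// caller's thread, so the order is fixed by construction.
struct Schedule {
    std::vector<Step> steps;

    void runCycle() const {
        for (const Step& s : steps) {
            assert(static_cast<bool>(s));  // every slot must hold a callable
            s();
        }
    }
};

// Builds a J, A, B, C schedule and runs two cycles, recording the order
// in which the steps actually executed.
std::string traceTwoCycles() {
    std::string trace;
    Schedule sched;
    for (char name : {'J', 'A', 'B', 'C'}) {
        sched.steps.push_back([&trace, name] { trace += name; });
    }
    sched.runCycle();
    sched.runCycle();
    return trace;
}
```

Because J runs in the same thread as A, B and C, it cannot run in between
them; the price is that J now shares their period and thread.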

Herman