Orocos limits under Xenomai

Hello all.

Our current system consists of several tasks running under Xenomai
2.5.4 and Orocos 1.12.1. We are looking into migrating to the latest
Orocos 2.5.

I'm in charge of checking the feasibility of this endeavour, and I
decided to start by seeing how the system handles the "new stuff".

Test 1: deployer-xenomai with a simple script that creates lots of TaskContexts:

for(var int i = 0; i != 100; i=i+1)
{
    loadComponent("Task"+i, "TaskContext")
}

The deployer blocks after creating about 83 tasks. That is far more
than we need, but see how the number drops once ports and connections
are added.

Test 2: deployer-xenomai with a script that creates lots of TaskPorts
components. TaskPorts is a TaskContext with two ports:

#include <string>
#include <rtt/TaskContext.hpp>
#include <rtt/Port.hpp>

// A TaskContext with one input port and one output port.
class TaskPorts : public RTT::TaskContext
{
public:
    TaskPorts(std::string const& name)
        : TaskContext(name)
    {
        this->ports()->addPort("InputPort1", _inputPort1).doc("");
        this->ports()->addPort("OutputPort1", _outputPort1).doc("");
    }

    RTT::InputPort<double> _inputPort1;
    RTT::OutputPort<double> _outputPort1;
};

The script now also connects each pair of tasks (the port names must
match the ones registered in the constructor):

for(var int i = 1; i < 100; i=i+2)
{
    loadComponent("TaskPort"+(i-1), "TaskPorts")
    loadComponent("TaskPort"+i, "TaskPorts")

    connect("TaskPort"+(i-1)+".OutputPort1", "TaskPort"+i+".InputPort1",
            ConnPolicy())
}

Now the deployer segfaults after creating 43 tasks. That is about half
as many tasks, just from adding two ports and one connection per task.
With six ports and three connections per task the number drops to ~20.
Our current system has far more tasks than that, and they are far more
complex than the simple cases presented here.

I know this is probably Xenomai related, but since there are other
Xenomai users out there I wanted to hear their opinion first. Have
you encountered something similar to this before? If so, how did you
solve it?

By the way, I have also run this with gdb. In all cases, the deployer
hangs/segfaults in Xenomai's libnative after performing a
MutexRecursive lock. I can post the backtraces if you want.

We know that 1.12.1 has its limits too, but they are higher than those
of 2.5. I guess 2.5 makes heavier use of resources?

Thanks for your time.

Jordán.
--

Jordán Palacios
Software Engineer

jordan [dot] palacios [..] ...
www.pal-robotics.com

PAL Robotics, S.L.
c/ Pujades 77-79, 4º4ª
08005 Barcelona
Spain
Tel +34 93 414 53 47
Fax +34 93 209 11 09
Skype: jordanpalacios.pal-robotics


Orocos limits under Xenomai

Hi Jordan,

Thanks for sharing your performance tests with us!

Some things that you should enable during such stress tests:

- Enable debugging in the Xenomai kernel modules (no ipipe debugging).
It will hurt your real-time performance, but failures will be as
graceful as possible.
- Also define -DOROSEM_OS_XENO_CHECK as a compilation flag (or use
add_definitions( -DOROSEM_OS_XENO_CHECK ) in RTT's top-level CMake
file).
- Increase the TLSF memory pool size when launching the deployer,
although it seems from your report that this is not the issue.
- Check the output of 'dmesg' for clues.
- Check whether this is stack size related. Each component gets a
Xenomai thread with a fixed stack size. Depending on your Orocos
version, this might be as small as 4 kB or as big as 128 kB (latest
toolchain-2.5).

And probably the solution to your problem:
- Set 'Number of registry slots' to 2048 in the 'make menuconfig'
step of the patched Linux kernel.

RTT 2.x uses many more mutex objects than 1.x. The default of 512 is
too low for most serious Orocos applications.

Peter

Orocos limits under Xenomai

Hi Peter.

Thanks for your answers.

On 17 November 2012 15:30, Peter Soetens <peter [..] ...> wrote:

> Hi Jordan,
>
> Thanks for sharing your performance tests with us !
>
> Some things that you should enable during such stress tests:
>
> - Enable debugging in the Xenomai kernel modules (no ipipe debugging).
> It will hurt your rt-performance, but will fail as gracefully as
> possible
> - Also define -DOROSEM_OS_XENO_CHECK as a compilation flag (or use
> add_definitions( -DOROSEM_OS_XENO_CHECK ) ) in rtt's toplevel cmake
> file
>

Added. Task creation failures are handled better with this (no
segfault), but the TaskBrowser still blocks.

[...]
0.613 [ ERROR ][DeploymentComponent::loadComponent] The constructor of component type TaskContext threw an exception!
0.613 [ ERROR ][DeploymentComponent::loadComponent] Failed to load component with name Task97: refused to be created.
0.613 [ Info ][Thread] Creating Thread for scheduler=ORO_SCHED_OTHER, priority=1, CPU affinity=0, with name='Task98'
0.614 [ ERROR ][Thread] Task98 : CANNOT INIT Xeno TASK Task98 error code: -12
0.614 [CRITICAL][Thread] Could not create thread Task98.
0.614 [ ERROR ][DeploymentComponent::loadComponent] The constructor of component type TaskContext threw an exception!
0.614 [ ERROR ][DeploymentComponent::loadComponent] Failed to load component with name Task98: refused to be created.
0.614 [ Info ][Thread] Creating Thread for scheduler=ORO_SCHED_OTHER, priority=1, CPU affinity=0, with name='Task99'
0.614 [ ERROR ][Thread] Task99 : CANNOT INIT Xeno TASK Task99 error code: -12
0.614 [CRITICAL][Thread] Could not create thread Task99.
0.614 [ ERROR ][DeploymentComponent::loadComponent] The constructor of component type TaskContext threw an exception!
0.614 [ ERROR ][DeploymentComponent::loadComponent] Failed to load component with name Task99: refused to be created.
0.615 [ Info ][Thread] Creating Thread for scheduler=ORO_SCHED_OTHER, priority=1, CPU affinity=0, with name='TaskBrowser'
0.615 [ ERROR ][Thread] TaskBrowser : CANNOT INIT Xeno TASK TaskBrowser error code: -12
0.615 [CRITICAL][Thread] Could not create thread TaskBrowser.
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

> - Increase the TLSF memory pool size when launching the deployer,
> although it seems from your report that this is not the issue
>

Tried already. It had no effect that I could observe.

> - check the output of 'dmesg' for clues
>

Checked. Before I added OROSEM_OS_XENO_CHECK, dmesg showed the segfault
being detected, after which Xenomai cleaned up all the allocated
resources (condition variables, semaphores, mutexes).

> - check if this isn't stack size related. Each component will get a
> xenomai thread with a fixed stack size. Depending on your orocos
> version, this might be as small as 4kb or as big as 128kb (latest
> toolchain-2.5)
>
> And probably as a solution to your problem:
> - Set 'Number of registry slots' to 2048 in your 'make menuconfig'
> step of the patched Linux kernel
>

And we have a winner!

With this the limits can be raised. Now it is up to us to find the value
that suits our needs.

My only concern is the possible performance impact of changing this
value, since I don't think it is "free". Any thoughts on this?

Thanks again Peter.

Jordán.

Orocos limits under Xenomai

On Mon, Nov 19, 2012 at 12:38 PM, Jordan Palacios
<jordan [dot] palacios [..] ...> wrote:
> My only concern is the possible performance impact of changing this
> value, since I don't think it is "free". Any thoughts on this?
>

I did not check this, but you should know that Xenomai's defaults have
always been conservative, on the low end, and that the Xenomai
developers have always left it to us to find sane values. So I don't
think that this value of 512 was chosen for performance reasons; it
could have been any number. Since this table is used by almost every
user-space Xenomai function, I'm assuming they optimised the hell out
of it anyway.

Peter

Orocos limits under Xenomai

2012/11/16 Jordan Palacios <jordan [dot] palacios [..] ...>

> I know this is probably Xenomai related, but since there are other
> Xenomai users out there I wanted to hear their opinion first. Have
> you encountered something similar to this before? If so, how did you
> solve it?
>

I personally had a hard time with Orocos under Xenomai in 2.5. I think
there were some regressions compared with prior 2.x versions (I can't
remember which :( ), but I did not have the time to report the problems
and stayed with gnulinux.

Did you run the same tests under gnulinux, to see whether the problem is
related to the switch from 1.12 to 2.5 or to the switch from gnulinux to
Xenomai?
Were you already using the deployer with scripts (or with XML files)?

Here are some questions for the Orocos developers that may help find
some clues:
Is it possible that new kernel options are required in Xenomai with the
2.5 version (heap/stack size and so on)?
Is it possible that some new concurrency protection in scripting is
related to this? I think there have been some improvements in this area
since v1.x.

Whatever happened here, you should get a graceful failure, not a
segfault. Either Xenomai, Orocos, or your ABI is faulty.

Orocos limits under Xenomai

Hi Willy.

Thanks for your comments.

On 16 November 2012 13:50, Willy Lambert <lambert [dot] willy [..] ...> wrote:

> I personally had a hard time with Orocos under Xenomai in 2.5. I think
> there were some regressions compared with prior 2.x versions (I can't
> remember which :( ), but I did not have the time to report the problems
> and stayed with gnulinux.
>
> Did you run the same tests under gnulinux, to see whether the problem is
> related to the switch from 1.12 to 2.5 or from gnulinux to Xenomai?
>

The same tests under gnulinux yield far better results: for example, I
can create ~300 tasks in test 2.

I didn't run these exact tests with Orocos 1.12 but, as I said, our
current system handles lots of components with their corresponding
connections, certainly far more than the limits these tests show for
Orocos 2.5.
