[Bug 673] New: Unable mix different CORBA library versions

https://www.fmtc.be/bugzilla/orocos/show_bug.cgi?id=673

Summary: Unable mix different CORBA library versions
Product: RTT
Version: rtt-trunk
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P3
Component: Corba
AssignedTo: orocos-dev [..] ...
ReportedBy: kiwi [dot] net [..] ...
CC: orocos-dev [..] ...
Estimated Hours: 0.0

Warning: this is a weird one, and sorry that it is so long.

When mixing different versions of the same CORBA library (ACE/TAO), we get
either CORBA exceptions or lockups when getting methods from remote components.
Our system (one deployer and one GUI program) works fine when both run on the
same machine. The fault appears to be independent of where the name service is
actually running. We realise that mixing different versions might cause
problems, and that the error may well be in the CORBA library, but given that
only certain Orocos calls trigger it we wanted to submit this.

Computers involved: Mac with ACE 5.6.6 (compiled from source), Ubuntu Hardy
with ACE 5.4.7, and Debian Lenny + Xenomai with ACE 5.6.3.

SCENARIO 1: Deployer on Mac and GUI on either Linux computer. Everything works.

SCENARIO 2: Deployer on Ubuntu and GUI attempted to run on Mac.

The GUI locks up with the backtrace below. The GUI has already looked up the
remote components it is interested in, connected to the 10+ ports of interest,
and is working through connecting a set of methods. The fault occurs on the
first method that has a parameter: we rearranged the code, and both 1- and
2-parameter methods cause the lockup. The GUI had already successfully
connected six (6) void-parameter methods, though each one does output the

88.061 [ ERROR  ][CorbaFallBackProtocol] Failing Corba::Any creation of type
void.

error message within the deployer. So void-parameter methods aren't a problem.

We get the same lockup when trying the GUI on the Debian+Xenomai computer.

The ctaskbrowser on the Mac can connect and see each component, and it can
successfully call methods with zero, one or more parameters.

Our GUI code used to connect the method is the following (uses a macro):

#define CONNECT_METHOD(_peer, _mtd, _type, _name)                        \
    _mtd = _peer->methods()->getMethod<_type>(_name);                    \
    if (!_mtd.ready())                                                   \
    {                                                                    \
        RTT::log(RTT::Error) << "Unable to connect method '"             \
                             << _name << "' in peer '"                   \
                             << _peer->getName() << "'"                  \
                             << RTT::endlog();                           \
        rc = false;                                                      \
    }

    CONNECT_METHOD(hmi, goSafe_mtd, void(void), "goSafe");

    CONNECT_METHOD(hmi, moveToPose_mtd, void(std::vector<double>,double),
                   "moveToPose");
...
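
Note that the macro writes its result to a variable "rc" that has to exist in
the calling scope. A macro-free variant of the same check could look roughly
like the sketch below. It is not our actual code: FakeMethod, FakePeer and
connectMethod are hypothetical stand-ins so the example builds without RTT,
and the real call would still go through methods()->getMethod<T>().

```cpp
#include <iostream>
#include <string>

// Hypothetical stand-in for an RTT method handle; only ready() matters here.
struct FakeMethod {
    bool connected;
    explicit FakeMethod(bool c = false) : connected(c) {}
    bool ready() const { return connected; }
};

// Hypothetical stand-in for a remote peer that only offers "goSafe".
struct FakePeer {
    std::string getName() const { return "hmi"; }
    FakeMethod getMethod(const std::string& name) const {
        return FakeMethod(name == "goSafe");
    }
};

// Same check as the macro, but the outcome is returned to the caller
// instead of being written to a variable 'rc' in the enclosing scope.
template <typename Peer, typename Method>
bool connectMethod(const Peer& peer, Method& mtd, const std::string& name)
{
    mtd = peer.getMethod(name);
    if (!mtd.ready()) {
        std::cerr << "Unable to connect method '" << name
                  << "' in peer '" << peer.getName() << "'" << std::endl;
        return false;
    }
    return true;
}
```

A caller would then combine the results itself, for example
"rc = connectMethod(hmi, goSafe_mtd, "goSafe") && rc;".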

Yes, we are using "ControlTaskServer::ThreadOrb()" within the GUI code (before
anyone asks).

Mac backtrace

^C
Program received signal SIGINT, Interrupt.
0x94db246e in __semwait_signal ()
(gdb) bt
#0  0x94db246e in __semwait_signal ()
#1  0x94ddd3e6 in _pthread_cond_wait ()
#2  0x94ddcdcd in pthread_cond_wait$UNIX2003 ()
#3  0x023c1a13 in cond_timedwait [inlined] () at OS_NS_Thread.inl:357
#4  0x023c1a13 in ACE_Condition_Thread_Mutex::wait (this=0x2f515ac,
mutex=@0x2f4dd64, abstime=0x0) at OS_NS_Thread.inl:100
#5  0x023c1abf in ACE_Condition_Thread_Mutex::wait (this=0x94db246e,
abstime=0x4) at ../../ace/Condition_Thread_Mutex.cpp:107
#6  0x01ff1b71 in TAO_Leader_Follower::wait_for_event (this=0x2f4dd60,
event=0xbfffe4e0, transport=0x2f51800, max_wait_time=0x0) at
../../../TAO/tao/Leader_Follower.cpp:280
#7  0x02045a88 in TAO_Wait_On_Leader_Follower::wait (this=0x2f517e0,
max_wait_time=0x94db246e, rd=@0xbfffe4d4) at
../../../TAO/tao/Wait_On_Leader_Follower.cpp:72
#8  0x02022951 in TAO::Synch_Twoway_Invocation::wait_for_reply
(this=0xbfffe7c8, max_wait_time=0x0, rd=@0x4, bd=@0xbfffe764) at
../../../TAO/tao/Synch_Invocation.cpp:252
#9  0x02022f29 in TAO::Synch_Twoway_Invocation::remote_twoway (this=0xbfffe7c8,
max_wait_time=0x0) at ../../../TAO/tao/Synch_Invocation.cpp:168
#10 0x01feea59 in TAO::Invocation_Adapter::invoke_twoway (this=0x2f518e8,
details=@0xbfffe96c, effective_target=@0xbfffe90c, r=@0xbfffe85c,
max_wait_time=@0xbfffe908) at ../../../TAO/tao/Invocation_Adapter.cpp:300
#11 0x01fedfa7 in TAO::Invocation_Adapter::invoke_remote_i (this=0xbfffea08,
stub=0x2f62520, details=@0xbfffe96c, effective_target=@0xbfffe90c,
max_wait_time=@0xbfffe908) at ../../../TAO/tao/Invocation_Adapter.cpp:274
#12 0x01fee594 in TAO::Invocation_Adapter::invoke_i (this=0xbfffea08,
stub=0x2f62520, details=@0xbfffe96c) at
../../../TAO/tao/Invocation_Adapter.cpp:91
#13 0x01fedd42 in ~TAO_Service_Context [inlined] () at
../../../TAO/tao/Invocation_Adapter.cpp:50
#14 0x01fedd42 in ~TAO_Service_Context [inlined] () at Service_Context.h:31
#15 0x01fedd42 in ~TAO_Operation_Details [inlined] () at
../../../TAO/tao/Invocation_Adapter.cpp:61
#16 0x01fedd42 in ~TAO_Operation_Details [inlined] () at Service_Context.h:66
#17 0x01fedd42 in TAO::Invocation_Adapter::invoke (this=0xbfffea08,
ex_data=0x4, ex_count=4) at ../../../TAO/tao/Invocation_Adapter.cpp:50
#18 0x01e10ef7 in RTT::Corba::MethodInterface::createMethod ()
#19 0x01d86f09 in RTT::Corba::CorbaMethodFactory::produce ()
#20 0x017bce0c in RTT::MethodC::D::checkAndCreate ()
#21 0x017bc3ea in RTT::MethodC::arg ()
#22 0x00023e8f in RTT::detail::DataSourceStorageImpl<1, void
()(double)>::initArgs<RTT::MethodC> (this=0x2f903d8, cc=@0x2f903dc) at
DataSourceStorage.hpp:157
#23 0x00026651 in RTT::detail::RemoteMethod<void ()(double)>::RemoteMethod
(this=0x2f903d0, of=0x2f54a34, name=@0xbfffec7c) at RemoteMethod.hpp:140
#24 0x0002907f in RTT::MethodRepository::getMethod<void ()(double)>
(this=0x2f54a34, name=@0xbfffedc4) at MethodRepository.hpp:147
#25 0x0001c3dd in robot::demo::gui::DeployerData::Initialize (this=0x2f48d00,
argc=1, argv=0xbffff0bc) at /z/l/demo/gui/DeployerData.cpp:141
#26 0x0000d7fc in robot::demo::gui::DemoMainWindow::DemoMainWindow
(this=0x2f3cde0, argc=1, argv=0xbffff0bc) at
/z/l/demo/gui/DemoMainWindow.cpp:33
#27 0x0008fe34 in main (argc=1, argv=0xbffff0bc) at /z/l/demo/gui/gui.cpp:42
(gdb) quit

SCENARIO 3: Deployer on Ubuntu and GUI attempted to run on Debian/Xenomai.

GUI aborts with

terminate called after throwing an instance of 'CORBA::TRANSIENT'
Aborted

while the deployer spits out the following

522.010 [ ERROR  ][CorbaFallBackProtocol] Failing Corba::Any creation of type
void.
522.014 [ ERROR  ][CorbaFallBackProtocol] Failing Corba::Any creation of type
void.
522.017 [ ERROR  ][CorbaFallBackProtocol] Failing Corba::Any creation of type
void.
522.021 [ ERROR  ][CorbaFallBackProtocol] Failing Corba::Any creation of type
void.
522.025 [ ERROR  ][CorbaFallBackProtocol] Failing Corba::Any creation of type
void.
522.056 [ ERROR  ][deployer-corba-gnulinux::main()] CORBA exception raised when
creating ExpressionProxy!
522.057 [ ERROR  ][deployer-corba-gnulinux::main()] system exception, ID
'IDL:omg.org/CORBA/TRANSIENT:1.0'
OMG minor code (2), described as '*unknown description*', completed = NO
 
522.080 [ ERROR  ][deployer-corba-gnulinux::main()] CORBA exception raised when
creating ExpressionProxy!
522.080 [ ERROR  ][deployer-corba-gnulinux::main()] system exception, ID
'IDL:omg.org/CORBA/TRANSIENT:1.0'
OMG minor code (2), described as '*unknown description*', completed = NO
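
The GUI abort itself happens because the exception is never caught on the GUI
side and so reaches std::terminate(). Guarding the lookup would not fix the
underlying failure, but it would let the GUI report the error instead of
aborting. A minimal sketch of that idea follows, under stated assumptions:
FakeTransient and guardedConnect are hypothetical names, and FakeTransient
only stands in for CORBA::TRANSIENT so the example builds without TAO. Real
code would catch CORBA::Exception rather than std::exception.

```cpp
#include <exception>
#include <functional>
#include <iostream>
#include <string>

// Hypothetical stand-in for CORBA::TRANSIENT (assumption: not the TAO type).
struct FakeTransient : std::exception {
    const char* what() const noexcept override { return "TRANSIENT"; }
};

// Runs a lookup and converts any exception into a 'false' result plus a
// log line, so the failure does not propagate up to std::terminate().
bool guardedConnect(const std::string& name,
                    const std::function<bool()>& lookup)
{
    try {
        return lookup();
    } catch (const std::exception& e) {
        std::cerr << "Unable to connect method '" << name
                  << "': " << e.what() << std::endl;
    } catch (...) {
        std::cerr << "Unable to connect method '" << name
                  << "': unknown exception" << std::endl;
    }
    return false;
}
```

The lookup is passed in as a callable so that the same guard can wrap each
getMethod call without repeating the try/catch.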

SCENARIO 4: Deployer on Ubuntu Hardy (P4 CPU) and the GUI attempted to run on
another Ubuntu Hardy installation (VIA C7 Pico ITX). We get the deployer output
of SCENARIO 3, but the GUI crashes with the following backtrace. The Hardy
installation on the Pico is held back to the 2.6.24-23-rt kernel due to hardware
conflicts, while the P4 installation was completely upgraded yesterday to
2.6.24-24-rt. There might be some very tiny differences here, but the ACE
library is the exact same version.

GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...
(gdb) r
Starting program: /opt/build/nrl/liive/demo/gui/demogui 
[Thread debugging using libthread_db enabled]
[New Thread 0xb54836c0 (LWP 6644)]
Qt: gdb: -nograb added to command-line options.
     Use the -dograb option to enforce grabbing.
[New Thread 0xb528eb90 (LWP 6647)]
terminate called after throwing an instance of 'CORBA::TRANSIENT'
 
Program received signal SIGABRT, Aborted.
[Switching to Thread 0xb54836c0 (LWP 6644)]
0xb7efe410 in __kernel_vsyscall ()
(gdb) bt
#0  0xb7efe410 in __kernel_vsyscall ()
#1  0xb58dd085 in raise () from /lib/tls/i686/cmov/libc.so.6
#2  0xb58dea01 in abort () from /lib/tls/i686/cmov/libc.so.6
#3  0xb5aec480 in __gnu_cxx::__verbose_terminate_handler () from
/usr/lib/libstdc++.so.6
#4  0xb5ae9d05 in ?? () from /usr/lib/libstdc++.so.6
#5  0xb5ae9d42 in std::terminate () from /usr/lib/libstdc++.so.6
#6  0xb5ae9df8 in __cxa_rethrow () from /usr/lib/libstdc++.so.6
#7  0xb623c463 in TAO::Synch_Twoway_Invocation::remote_twoway () from
/usr/lib/libTAO.so.1.4.7
#8  0xb61f1c7f in TAO::Invocation_Adapter::invoke_twoway () from
/usr/lib/libTAO.so.1.4.7
#9  0xb61f17d7 in TAO::Invocation_Adapter::invoke_remote_i () from
/usr/lib/libTAO.so.1.4.7
#10 0xb61f1894 in TAO::Invocation_Adapter::invoke_i () from
/usr/lib/libTAO.so.1.4.7
#11 0xb61f1ddb in TAO::Invocation_Adapter::invoke () from
/usr/lib/libTAO.so.1.4.7
#12 0xb6bb43f0 in RTT::Corba::MethodInterface::createMethod ()
   from /opt/install/lib/liborocos-rtt-corba-gnulinux.so.1.6
#13 0xb6b83837 in RTT::Corba::CorbaMethodFactory::produce ()
   from /opt/install/lib/liborocos-rtt-corba-gnulinux.so.1.6
#14 0xb6e698de in RTT::MethodC::D::checkAndCreate () from
/opt/install/lib/liborocos-rtt-gnulinux.so.1.6
#15 0xb6e68c54 in RTT::MethodC::arg () from
/opt/install/lib/liborocos-rtt-gnulinux.so.1.6
#16 0x08165ff9 in RTT::detail::DataSourceStorageImpl<2, void
()(std::vector<double, std::allocator<double> >,
double)>::initArgs<RTT::MethodC> (this=0x8297748, cc=@0x8297750) at
/opt/install/include/rtt/DataSourceStorage.hpp:177
#17 0x081660dc in RemoteMethod (this=0x8297740, of=0x823f164, name=@0xbf8a7c64)
    at /opt/install/include/rtt/RemoteMethod.hpp:140
#18 0x08166d13 in RTT::MethodRepository::getMethod<void ()(std::vector<double,
std::allocator<double> >, double)>
    (this=0x823f164, name=@0xbf8a7e48) at
/opt/install/include/rtt/MethodRepository.hpp:147
#19 0x08117c09 in robot::demo::gui::DeployerData::Initialize (this=0x820c988,
argc=1, argv=0xbf8a8094)
    at /z/l/demo/gui/DeployerData.cpp:162
#20 0x0810ceb2 in DemoMainWindow (this=0x820be50, argc=1, argv=0xbf8a8094) at
/z/l/demo/gui/DemoMainWindow.cpp:33
#21 0x081837db in main (argc=1, argv=0xbf8a8094) at /z/l/demo/gui/gui.cpp:42
(gdb) quit
The program is running.  Exit anyway? (y or n) y

Any ideas?

[Bug 673] Unable mix different CORBA library versions

https://www.fmtc.be/bugzilla/orocos/show_bug.cgi?id=673

Peter Soetens <peter [..] ...> changed:

What |Removed |Added
----------------------------------------------------------------------------
Resolution| |INVALID
Status|NEW |RESOLVED

--- Comment #2 from Peter Soetens <peter [..] ...> 2009-06-21 22:23:38 ---
It turned out to be a misconfiguration. See
http://orocos.org/wiki/rtt/frequently-asked-questions-faq/using-corba