Patching SOEM to remove critical gettimeofday() calls

SOEM, as contained in the soem_core ROS stack and used by the Orocos
ethercat master component, makes a couple of gettimeofday() calls in code
that should be HRT-safe. These calls are not HRT-safe under either
RT_PREEMPT (as mentioned earlier on this mailinglist) or under Xenomai.

We utilize SOEM as our EtherCAT master library, but do not utilize
redundancy. Therefore, the two calls that we currently need to patch (and
are able to test patches for) are the ones on lines 577 and 647 of
nicdrv.c. Currently, we simply replace them with clock_gettime() calls and
some timespec -> timeval conversions. That solution, however, is a bit ugly
to get included in the main EtherCAT master component repository.

I've come up with 2 solutions that are cleaner than our current
patch<http://code.google.com/p/atrias/source/browse/patches_etc/soem_rt_fix.patch>
:

1) Replace all timeval's with timespec structs; implement a new timeradd,
and timercmp, and switch out all code in the ec_waitinframe and
ec_waitinframe_red functions to use clock_gettime and timespecs.

2) Implement a rt_gettimeofday() function that is a drop-in replacement for
gettimeofday; replace the "troublesome" instances of gettimeofday with this
new function.

In my opinion, solution 2 would be preferable, since solution 1) is rather
invasive (there's a lot of timevals passed around in those functions) and
we would not be able to test all code paths (specifically, we cannot test
the redundancy paths). However, before going ahead with solution 2 and
submitting a patch, I wanted to ask for your opinions.

Should I go ahead with plan 2 for creating a patch? If not, what should I
do instead and why?

Thank You,
Johnathan Van Why
Oregon State University
Dynamic Robotics Laboratory

Patching SOEM to remove critical gettimeofday() calls

Hi Johnathan,

On Fri, Dec 28, 2012 at 7:14 PM, Johnathan Van Why <jrvanwhy [..] ...>wrote:

> SOEM, as contained in the soem_core ROS stack and used by the Orocos
> ethercat master component, makes a couple of gettimeofday() calls in code
> that should be HRT-safe. These calls are not HRT-safe under either
> RT_PREEMPT (as mentioned earlier on this mailinglist) or under Xenomai.
>

Indeed.

>
> We utilize SOEM as our EtherCAT master library, but do not utilize
> redundancy. Therefore, the two calls that we currently need to patch (and
> are able to test patches for) are the ones on lines 577 and 647 of
> nicdrv.c. Currently, we simply replace them with clock_gettime() calls and
> some timespec -> timeval conversions. That solution, however, is a bit ugly
> to get included in the main EtherCAT master component repository.
>
> I've come up with 2 solutions that are cleaner than our current patch<http://code.google.com/p/atrias/source/browse/patches_etc/soem_rt_fix.patch>
> :
>
> 1) Replace all timeval's with timespec structs; implement a new timeradd,
> and timercmp, and switch out all code in the ec_waitinframe and
> ec_waitinframe_red functions to use clock_gettime and timespecs.
>
> 2) Implement a rt_gettimeofday() function that is a drop-in replacement
> for gettimeofday; replace the "troublesome" instances of gettimeofday with
> this new function.
>
> In my opinion, solution 2 would be preferable, since solution 1) is rather
> invasive (there's a lot of timevals passed around in those functions) and
> we would not be able to test all code paths (specifically, we cannot test
> the redundancy paths). However, before going ahead with solution 2 and
> submitting a patch, I wanted to ask for your opinions.
>
> Should I go ahead with plan 2 for creating a patch? If not, what should I
> do instead and why?
>

I would at first glance also go for option 2. Other Orocos code is also
still not fixed for this case and may benefit from your solution (even if
it serves only as an example), see
http://bugs.orocos.org/show_bug.cgi?id=881

A patch only targeting Xenomai 2.6.x would not be a problem imho.

Peter

Patching SOEM to remove critical gettimeofday() calls

On Sat, Dec 29, 2012 at 4:06 PM, Peter Soetens <peter [..] ...> wrote:
>
> Hi Johnathan,
>
> On Fri, Dec 28, 2012 at 7:14 PM, Johnathan Van Why <jrvanwhy [..] ...>
> wrote:
>>
>> SOEM, as contained in the soem_core ROS stack and used by the Orocos
>> ethercat master component, makes a couple of gettimeofday() calls in code
>> that should be HRT-safe. These calls are not HRT-safe under either
>> RT_PREEMPT (as mentioned earlier on this mailinglist) or under Xenomai.
>
>
> Indeed.
>
>>
>>
>> We utilize SOEM as our EtherCAT master library, but do not utilize
>> redundancy. Therefore, the two calls that we currently need to patch (and
>> are able to test patches for) are the ones on lines 577 and 647 of nicdrv.c.
>> Currently, we simply replace them with clock_gettime() calls and some
>> timespec -> timeval conversions. That solution, however, is a bit ugly to
>> get included in the main EtherCAT master component repository.
>>
>> I've come up with 2 solutions that are cleaner than our current patch:
>>
>> 1) Replace all timeval's with timespec structs; implement a new timeradd,
>> and timercmp, and switch out all code in the ec_waitinframe and
>> ec_waitinframe_red functions to use clock_gettime and timespecs.
>>
>> 2) Implement a rt_gettimeofday() function that is a drop-in replacement
>> for gettimeofday; replace the "troublesome" instances of gettimeofday with
>> this new function.
>>
>> In my opinion, solution 2 would be preferable, since solution 1) is
>> rather invasive (there's a lot of timevals passed around in those functions)
>> and we would not be able to test all code paths (specifically, we cannot
>> test the redundancy paths). However, before going ahead with solution 2 and
>> submitting a patch, I wanted to ask for your opinions.
>>
>> Should I go ahead with plan 2 for creating a patch? If not, what should I
>> do instead and why?
>
>
> I would at first glance also go for option 2. Other Orocos code is also
> still not fixed for this case and may benefit from your solution (even if it
> serves only as an example), see http://bugs.orocos.org/show_bug.cgi?id=881
>
> A patch only targeting Xenomai 2.6.x would not be a problem imho.
>
> Peter
>

Okay, I've completed my patch. This fixes the slaveinfo executable
build under Xenomai (this has never worked for us; we've been
disabling it to make soem_core build successfully) and makes SOEM
realtime-safe if no redundancy is utilized.

Attached are two files -- soem.patch should be applied via "git apply
soem.patch" -- this patches CMakeLists.txt (to link things correctly)
and the Makefile (to apply rt_fix.patch). rt_fix.patch should simply
be dropped into soem/soem_core and committed to the repository -- this
patches nicdrv.c

For anyone who needs HRT redundant operation, look in the nicdrv.c
file, and replace all remaining calls in ec_waitinframe_red to
gettimeofday(*tv, 0) with calls to rt_gettimeofday(*tv). I haven't
done this myself as I have no way to test that code.

Please let me know if you have any questions,
Johnathan Van Why
Dynamic Robotics Laboratory
Oregon State University

Patching SOEM to remove critical gettimeofday() calls

Hi Jonathan,

On Wed, Jan 02, 2013 at 12:11:14PM -0800, Johnathan Van Why wrote:
> On Sat, Dec 29, 2012 at 4:06 PM, Peter Soetens <peter [..] ...> wrote:
> >
> > Hi Johnathan,
> >
> > On Fri, Dec 28, 2012 at 7:14 PM, Johnathan Van Why <jrvanwhy [..] ...>
> > wrote:
> >>
> >> SOEM, as contained in the soem_core ROS stack and used by the Orocos
> >> ethercat master component, makes a couple of gettimeofday() calls in code
> >> that should be HRT-safe. These calls are not HRT-safe under either
> >> RT_PREEMPT (as mentioned earlier on this mailinglist) or under Xenomai.
> >
> >
> > Indeed.
> >
> >>
> >>
> >> We utilize SOEM as our EtherCAT master library, but do not utilize
> >> redundancy. Therefore, the two calls that we currently need to patch (and
> >> are able to test patches for) are the ones on lines 577 and 647 of nicdrv.c.
> >> Currently, we simply replace them with clock_gettime() calls and some
> >> timespec -> timeval conversions. That solution, however, is a bit ugly to
> >> get included in the main EtherCAT master component repository.
> >>
> >> I've come up with 2 solutions that are cleaner than our current patch:
> >>
> >> 1) Replace all timeval's with timespec structs; implement a new timeradd,
> >> and timercmp, and switch out all code in the ec_waitinframe and
> >> ec_waitinframe_red functions to use clock_gettime and timespecs.
> >>
> >> 2) Implement a rt_gettimeofday() function that is a drop-in replacement
> >> for gettimeofday; replace the "troublesome" instances of gettimeofday with
> >> this new function.
> >>
> >> In my opinion, solution 2 would be preferable, since solution 1) is
> >> rather invasive (there's a lot of timevals passed around in those functions)
> >> and we would not be able to test all code paths (specifically, we cannot
> >> test the redundancy paths). However, before going ahead with solution 2 and
> >> submitting a patch, I wanted to ask for your opinions.
> >>
> >> Should I go ahead with plan 2 for creating a patch? If not, what should I
> >> do instead and why?
> >
> >
> > I would at first glance also go for option 2. Other Orocos code is also
> > still not fixed for this case and may benefit from your solution (even if it
> > serves only as an example), see http://bugs.orocos.org/show_bug.cgi?id=881
> >
> > A patch only targeting Xenomai 2.6.x would not be a problem imho.
> >
> > Peter
> >
>
> Okay, I've completed my patch. This fixes the slaveinfo executable
> build under Xenomai (this has never worked for us; we've been
> disabling it to make soem_core build successfully) and makes SOEM
> realtime-safe if no redundancy is utilized.
>
> Attached are two files -- soem.patch should be applied via "git apply
> soem.patch" -- this patches CMakeLists.txt (to link things correctly)
> and the Makefile (to apply rt_fix.patch). rt_fix.patch should simply
> be dropped into soem/soem_core and committed to the repository -- this
> patches nicdrv.c
>
> For anyone who needs HRT redundant operation, look in the nicdrv.c
> file, and replace all remaining calls in ec_waitinframe_red to
> gettimeofday(*tv, 0) with calls to rt_gettimeofday(*tv). I haven't
> done this myself as I have no way to test that code.

Could you also post the SOEM parts of the patch on the
SOEM-mailinglist? This should really be merged upstream...

Markus

Patching SOEM to remove critical gettimeofday() calls

On Wed, Jan 2, 2013 at 11:19 PM, Markus Klotzbuecher
<markus [dot] klotzbuecher [..] ...> wrote:
> Hi Jonathan,
>
> On Wed, Jan 02, 2013 at 12:11:14PM -0800, Johnathan Van Why wrote:
>> On Sat, Dec 29, 2012 at 4:06 PM, Peter Soetens <peter [..] ...> wrote:
>> >
>> > Hi Johnathan,
>> >
>> > On Fri, Dec 28, 2012 at 7:14 PM, Johnathan Van Why <jrvanwhy [..] ...>
>> > wrote:
>> >>
>> >> SOEM, as contained in the soem_core ROS stack and used by the Orocos
>> >> ethercat master component, makes a couple of gettimeofday() calls in code
>> >> that should be HRT-safe. These calls are not HRT-safe under either
>> >> RT_PREEMPT (as mentioned earlier on this mailinglist) or under Xenomai.
>> >
>> >
>> > Indeed.
>> >
>> >>
>> >>
>> >> We utilize SOEM as our EtherCAT master library, but do not utilize
>> >> redundancy. Therefore, the two calls that we currently need to patch (and
>> >> are able to test patches for) are the ones on lines 577 and 647 of nicdrv.c.
>> >> Currently, we simply replace them with clock_gettime() calls and some
>> >> timespec -> timeval conversions. That solution, however, is a bit ugly to
>> >> get included in the main EtherCAT master component repository.
>> >>
>> >> I've come up with 2 solutions that are cleaner than our current patch:
>> >>
>> >> 1) Replace all timeval's with timespec structs; implement a new timeradd,
>> >> and timercmp, and switch out all code in the ec_waitinframe and
>> >> ec_waitinframe_red functions to use clock_gettime and timespecs.
>> >>
>> >> 2) Implement a rt_gettimeofday() function that is a drop-in replacement
>> >> for gettimeofday; replace the "troublesome" instances of gettimeofday with
>> >> this new function.
>> >>
>> >> In my opinion, solution 2 would be preferable, since solution 1) is
>> >> rather invasive (there's a lot of timevals passed around in those functions)
>> >> and we would not be able to test all code paths (specifically, we cannot
>> >> test the redundancy paths). However, before going ahead with solution 2 and
>> >> submitting a patch, I wanted to ask for your opinions.
>> >>
>> >> Should I go ahead with plan 2 for creating a patch? If not, what should I
>> >> do instead and why?
>> >
>> >
>> > I would at first glance also go for option 2. Other Orocos code is also
>> > still not fixed for this case and may benefit from your solution (even if it
>> > serves only as an example), see http://bugs.orocos.org/show_bug.cgi?id=881
>> >
>> > A patch only targeting Xenomai 2.6.x would not be a problem imho.
>> >
>> > Peter
>> >
>>
>> Okay, I've completed my patch. This fixes the slaveinfo executable
>> build under Xenomai (this has never worked for us; we've been
>> disabling it to make soem_core build successfully) and makes SOEM
>> realtime-safe if no redundancy is utilized.
>>
>> Attached are two files -- soem.patch should be applied via "git apply
>> soem.patch" -- this patches CMakeLists.txt (to link things correctly)
>> and the Makefile (to apply rt_fix.patch). rt_fix.patch should simply
>> be dropped into soem/soem_core and committed to the repository -- this
>> patches nicdrv.c
>>
>> For anyone who needs HRT redundant operation, look in the nicdrv.c
>> file, and replace all remaining calls in ec_waitinframe_red to
>> gettimeofday(*tv, 0) with calls to rt_gettimeofday(*tv). I haven't
>> done this myself as I have no way to test that code.
>
> Could you also post the SOEM parts of the patch on the
> SOEM-mailinglist? This should really be merged upstream...
>
> Markus

I have emailed the SOEM mailing list asking if they would like a copy
of this patch, but have not received any response.

Would it be possible for this patch to be merged into the Orocos SOEM
stack anyway?

Thank You,
Johnathan Van Why
Dynamic Robotics Laboratory
Oregon State University

Ruben Smits's picture

Patching SOEM to remove critical gettimeofday() calls

On Fri, Feb 15, 2013 at 11:19 PM, Johnathan Van Why <jrvanwhy [..] ...>wrote:

> On Wed, Jan 2, 2013 at 11:19 PM, Markus Klotzbuecher
> <markus [dot] klotzbuecher [..] ...> wrote:
> > Hi Jonathan,
> >
> > On Wed, Jan 02, 2013 at 12:11:14PM -0800, Johnathan Van Why wrote:
> >> On Sat, Dec 29, 2012 at 4:06 PM, Peter Soetens <
> peter [..] ...> wrote:
> >> >
> >> > Hi Johnathan,
> >> >
> >> > On Fri, Dec 28, 2012 at 7:14 PM, Johnathan Van Why <
> jrvanwhy [..] ...>
> >> > wrote:
> >> >>
> >> >> SOEM, as contained in the soem_core ROS stack and used by the Orocos
> >> >> ethercat master component, makes a couple of gettimeofday() calls in
> code
> >> >> that should be HRT-safe. These calls are not HRT-safe under either
> >> >> RT_PREEMPT (as mentioned earlier on this mailinglist) or under
> Xenomai.
> >> >
> >> >
> >> > Indeed.
> >> >
> >> >>
> >> >>
> >> >> We utilize SOEM as our EtherCAT master library, but do not utilize
> >> >> redundancy. Therefore, the two calls that we currently need to patch
> (and
> >> >> are able to test patches for) are the ones on lines 577 and 647 of
> nicdrv.c.
> >> >> Currently, we simply replace them with clock_gettime() calls and some
> >> >> timespec -> timeval conversions. That solution, however, is a bit
> ugly to
> >> >> get included in the main EtherCAT master component repository.
> >> >>
> >> >> I've come up with 2 solutions that are cleaner than our current
> patch:
> >> >>
> >> >> 1) Replace all timeval's with timespec structs; implement a new
> timeradd,
> >> >> and timercmp, and switch out all code in the ec_waitinframe and
> >> >> ec_waitinframe_red functions to use clock_gettime and timespecs.
> >> >>
> >> >> 2) Implement a rt_gettimeofday() function that is a drop-in
> replacement
> >> >> for gettimeofday; replace the "troublesome" instances of
> gettimeofday with
> >> >> this new function.
> >> >>
> >> >> In my opinion, solution 2 would be preferable, since solution 1) is
> >> >> rather invasive (there's a lot of timevals passed around in those
> functions)
> >> >> and we would not be able to test all code paths (specifically, we
> cannot
> >> >> test the redundancy paths). However, before going ahead with
> solution 2 and
> >> >> submitting a patch, I wanted to ask for your opinions.
> >> >>
> >> >> Should I go ahead with plan 2 for creating a patch? If not, what
> should I
> >> >> do instead and why?
> >> >
> >> >
> >> > I would at first glance also go for option 2. Other Orocos code is
> also
> >> > still not fixed for this case and may benefit from your solution
> (even if it
> >> > serves only as an example), see
> http://bugs.orocos.org/show_bug.cgi?id=881
> >> >
> >> > A patch only targeting Xenomai 2.6.x would not be a problem imho.
> >> >
> >> > Peter
> >> >
> >>
> >> Okay, I've completed my patch. This fixes the slaveinfo executable
> >> build under Xenomai (this has never worked for us; we've been
> >> disabling it to make soem_core build successfully) and makes SOEM
> >> realtime-safe if no redundancy is utilized.
> >>
> >> Attached are two files -- soem.patch should be applied via "git apply
> >> soem.patch" -- this patches CMakeLists.txt (to link things correctly)
> >> and the Makefile (to apply rt_fix.patch). rt_fix.patch should simply
> >> be dropped into soem/soem_core and committed to the repository -- this
> >> patches nicdrv.c
> >>
> >> For anyone who needs HRT redundant operation, look in the nicdrv.c
> >> file, and replace all remaining calls in ec_waitinframe_red to
> >> gettimeofday(*tv, 0) with calls to rt_gettimeofday(*tv). I haven't
> >> done this myself as I have no way to test that code.
> >
> > Could you also post the SOEM parts of the patch on the
> > SOEM-mailinglist? This should really be merged upstream...
> >
> > Markus
>
> I have emailed the SOEM mailing list asking if they would like a copy
> of this patch, but have not received any response.
>
> Would it be possible for this patch to be merged into the Orocos SOEM
> stack anyway?
>
>
We already have some patches (rtnet related) on top of soem upstream in the
Orocos SOEM stack. I think we can host some more patches for the time being
until they get merged into upstream.

Ruben

> Thank You,
> Johnathan Van Why
> Dynamic Robotics Laboratory
> Oregon State University
> --
> Orocos-Users mailing list
> Orocos-Users [..] ...
> http://lists.mech.kuleuven.be/mailman/listinfo/orocos-users
>

Patching SOEM to remove critical gettimeofday() calls

On Mon, Feb 18, 2013 at 12:03 PM, Ruben Smits
<ruben [dot] smits [..] ...> wrote:
>
> On Fri, Feb 15, 2013 at 11:19 PM, Johnathan Van Why <jrvanwhy [..] ...>
> wrote:
>>
>> On Wed, Jan 2, 2013 at 11:19 PM, Markus Klotzbuecher
>> <markus [dot] klotzbuecher [..] ...> wrote:
>> > Hi Jonathan,
>> >
>> > On Wed, Jan 02, 2013 at 12:11:14PM -0800, Johnathan Van Why wrote:
>> >> On Sat, Dec 29, 2012 at 4:06 PM, Peter Soetens
>> >> <peter [..] ...> wrote:
>> >> >
>> >> > Hi Johnathan,
>> >> >
>> >> > On Fri, Dec 28, 2012 at 7:14 PM, Johnathan Van Why
>> >> > <jrvanwhy [..] ...>
>> >> > wrote:
>> >> >>
>> >> >> SOEM, as contained in the soem_core ROS stack and used by the Orocos
>> >> >> ethercat master component, makes a couple of gettimeofday() calls in
>> >> >> code
>> >> >> that should be HRT-safe. These calls are not HRT-safe under either
>> >> >> RT_PREEMPT (as mentioned earlier on this mailinglist) or under
>> >> >> Xenomai.
>> >> >
>> >> >
>> >> > Indeed.
>> >> >
>> >> >>
>> >> >>
>> >> >> We utilize SOEM as our EtherCAT master library, but do not utilize
>> >> >> redundancy. Therefore, the two calls that we currently need to patch
>> >> >> (and
>> >> >> are able to test patches for) are the ones on lines 577 and 647 of
>> >> >> nicdrv.c.
>> >> >> Currently, we simply replace them with clock_gettime() calls and
>> >> >> some
>> >> >> timespec -> timeval conversions. That solution, however, is a bit
>> >> >> ugly to
>> >> >> get included in the main EtherCAT master component repository.
>> >> >>
>> >> >> I've come up with 2 solutions that are cleaner than our current
>> >> >> patch:
>> >> >>
>> >> >> 1) Replace all timeval's with timespec structs; implement a new
>> >> >> timeradd,
>> >> >> and timercmp, and switch out all code in the ec_waitinframe and
>> >> >> ec_waitinframe_red functions to use clock_gettime and timespecs.
>> >> >>
>> >> >> 2) Implement a rt_gettimeofday() function that is a drop-in
>> >> >> replacement
>> >> >> for gettimeofday; replace the "troublesome" instances of
>> >> >> gettimeofday with
>> >> >> this new function.
>> >> >>
>> >> >> In my opinion, solution 2 would be preferable, since solution 1) is
>> >> >> rather invasive (there's a lot of timevals passed around in those
>> >> >> functions)
>> >> >> and we would not be able to test all code paths (specifically, we
>> >> >> cannot
>> >> >> test the redundancy paths). However, before going ahead with
>> >> >> solution 2 and
>> >> >> submitting a patch, I wanted to ask for your opinions.
>> >> >>
>> >> >> Should I go ahead with plan 2 for creating a patch? If not, what
>> >> >> should I
>> >> >> do instead and why?
>> >> >
>> >> >
>> >> > I would at first glance also go for option 2. Other Orocos code is
>> >> > also
>> >> > still not fixed for this case and may benefit from your solution
>> >> > (even if it
>> >> > serves only as an example), see
>> >> > http://bugs.orocos.org/show_bug.cgi?id=881
>> >> >
>> >> > A patch only targeting Xenomai 2.6.x would not be a problem imho.
>> >> >
>> >> > Peter
>> >> >
>> >>
>> >> Okay, I've completed my patch. This fixes the slaveinfo executable
>> >> build under Xenomai (this has never worked for us; we've been
>> >> disabling it to make soem_core build successfully) and makes SOEM
>> >> realtime-safe if no redundancy is utilized.
>> >>
>> >> Attached are two files -- soem.patch should be applied via "git apply
>> >> soem.patch" -- this patches CMakeLists.txt (to link things correctly)
>> >> and the Makefile (to apply rt_fix.patch). rt_fix.patch should simply
>> >> be dropped into soem/soem_core and committed to the repository -- this
>> >> patches nicdrv.c
>> >>
>> >> For anyone who needs HRT redundant operation, look in the nicdrv.c
>> >> file, and replace all remaining calls in ec_waitinframe_red to
>> >> gettimeofday(*tv, 0) with calls to rt_gettimeofday(*tv). I haven't
>> >> done this myself as I have no way to test that code.
>> >
>> > Could you also post the SOEM parts of the patch on the
>> > SOEM-mailinglist? This should really be merged upstream...
>> >
>> > Markus
>>
>> I have emailed the SOEM mailing list asking if they would like a copy
>> of this patch, but have not received any response.
>>
>> Would it be possible for this patch to be merged into the Orocos SOEM
>> stack anyway?
>>
>
> We already have some patches (rtnet related) on top of soem upstream in the
> Orocos SOEM stack. I think we can host some more patches for the time being
> until they get merged into upstream.
>
> Ruben

We just found out today that linking with rtnet is disabled in the
Makefile in the soem_core repository. The following line should be
removed from that Makefile:
EXTRA_CMAKE_FLAGS=-DENABLE_RTNET=OFF

The commit message for the commit that added this line mentions driver
updates and additions, so I do not think this is intentional.

Thank You,
Johnathan Van Why

>
>>
>> Thank You,
>> Johnathan Van Why
>> Dynamic Robotics Laboratory
>> Oregon State University
>> --
>> Orocos-Users mailing list
>> Orocos-Users [..] ...
>> http://lists.mech.kuleuven.be/mailman/listinfo/orocos-users
>
>