[Olsr-dev] alignment faults on ARM processors

Peter Bigot (spam-protected)
Sat Aug 4 00:41:51 CEST 2012


On Fri Jul 27 08:50:52 CEST 2012, Henning Rogge wrote:
> On Thu, Jul 26, 2012 at 9:11 AM, Peter Bigot <(spam-protected)> wrote:
> > I was recently asked to validate OLSRD on a custom embedded platform
> > involving an AT91 ARM processor and Linux 2.6.39 under OpenEmbedded
> > Classic.  While the 0.5.5 release worked, 0.5.6-r8 and 0.6.3 produced
> > a cascade of errors attempting to route to invalid IP addresses.
> >
> > A kernel error message indicated unaligned accesses.  I tracked this
> > down to several uses of the following idiom in src/lq_packet.h:
> >
> >   *var = ntohl(**((const uint32_t **)p));
> >
> > where p is a pointer to a pointer to const uint8_t.  It seems that in
> > recent Linux kernels CONFIG_ALIGNMENT_TRAP is enabled by default on
> > these ARM chips, and the version of Sourcery_G++_Lite required on this
> > platform does not support -mno-alignment-traps.  In practice, the
> > pointer at this point did not satisfy the alignment requirements of
> > the processor, and the kernel did not provide a recovery.
> >
> > I worked around this by replacing the assignment with an intermediate
> > copy from the unaligned buffer into a 4-byte region in a local union
> > with a uint32_t; similarly for the int16_t accesses.  I regret that I
> > can't provide the patch (which is both trivial and a hack anyway), but
> > thought a heads-up might be helpful for others trying to make OLSRD
> > work in a similar environment.
> >
> > Peter
>
> Hmm... I think all original OLSR messages inside the OLSR packet are
> well aligned, as are the fields inside the messages. Since 0.6.0 we even
> have checks if the length of a message cannot be
>
> The fact that an uint8_t pointer is cast to 32 bit is okay, because the
> uint8_t is a buffer with the packet, which contains all kind of fields
> (but still should be well aligned).
>
> Can you explain a little bit more why the kernel is misbehaving? If we
> would have unaligned word/byte access, the MIPS architecture would crash
> every time you run OLSRd.

I can't explain why it happens, just that I got the kernel alignment traps
until I forced the alignment as described above, at which point they stopped
happening.  In this case I was using Marvell's own driver for the SD8787 (not
the one in the Linux kernel).  Whoever's responsible for
placing the data at an unaligned location, the point is that the alignment
requirements of the CPU weren't being satisfied and OLSR wasn't checking for
it.
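
For the archive, here is a minimal sketch of the kind of intermediate copy
I described.  It is not the actual patch: the function names are made up
for illustration and do not appear in the OLSRD sources, and it assumes p
points to a pointer into the received packet buffer, as in the idiom quoted
above.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical helper (not from olsrd): read a network-order 32-bit
 * value from a possibly unaligned buffer position.  The bytes are
 * copied into a local union, which the compiler aligns suitably for
 * uint32_t, so no unaligned word load is issued. */
static void
get_u32_unaligned_sketch(const uint8_t **p, uint32_t *var)
{
  union {
    uint8_t bytes[sizeof(uint32_t)];
    uint32_t word;
  } tmp;

  memcpy(tmp.bytes, *p, sizeof(tmp.bytes));
  *var = ntohl(tmp.word);
}

/* Same idea for the 16-bit accesses (also hypothetical). */
static void
get_s16_unaligned_sketch(const uint8_t **p, int16_t *var)
{
  union {
    uint8_t bytes[sizeof(uint16_t)];
    uint16_t half;
  } tmp;

  memcpy(tmp.bytes, *p, sizeof(tmp.bytes));
  *var = (int16_t)ntohs(tmp.half);
}
```

The copy costs a few byte loads per field, but it works whether or not the
driver hands over an aligned buffer.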

I was only asked to do the basic validation of the port; I'm not actively
involved in the development based on OLSR, so probably won't be able to
provide any more detail.  But I did want to get a record of the issue out
into the archive so the next person who runs into the problem might be able
to figure it out faster.

Peter



