[Olsr-users] olsrd 0.6.6.1 (and earlier) ipv6 problems

Russell Senior (spam-protected)
Thu Mar 27 10:37:39 CET 2014


>>>>> "Henning" == Henning Rogge <(spam-protected)> writes:

Henning> On 03/26/2014 07:41 PM, Russell Senior wrote:
>> Anybody get a chance to look at the strace?  I see a:

Henning> strace and packet dumps are much too low-level to hunt
Henning> problems like this directly. That's why Saverio's question
Henning> about txtinfo is good, because it gives you a much more
Henning> high-level view of what is going on.

I had not installed the modules previously, so that interface wasn't
immediately available.  It is now.

[...]

Henning> Okay, let's get back to the high-level view.

Henning> To interpret the events you described we need a list of
Henning> nodes, with their interface IPs and the connectivity between
Henning> them. 

Here is the list of neighbors of 2001:470:e962::407.  The addresses
listed are on the public wifi.  The OpenVPN addresses of each node are
a permutation, e.g. if the public wifi addr is 2001:470:e962:wxyz::1,
then the OpenVPN address of the node is 2001:470:e962::wxyz.
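
As a sketch of the address permutation described above: the fourth
16-bit group of the public wifi address becomes the last group of the
OpenVPN address.  The /48 prefix length and the function name
wifi_to_vpn are assumptions made for illustration only.

```python
import ipaddress

# Assumption: every node address sits inside this /48.
PREFIX = ipaddress.IPv6Network("2001:470:e962::/48")

def wifi_to_vpn(wifi_addr: str) -> str:
    """Map 2001:470:e962:wxyz::1 to 2001:470:e962::wxyz."""
    addr = ipaddress.IPv6Address(wifi_addr)
    if addr not in PREFIX:
        raise ValueError(f"{wifi_addr} is outside {PREFIX}")
    # The fourth 16-bit group is the low 16 bits of the upper 64 bits.
    group = (int(addr) >> 64) & 0xFFFF
    # Place that group in the lowest 16 bits of the otherwise empty prefix.
    vpn = int(PREFIX.network_address) | group
    return str(ipaddress.IPv6Address(vpn))

print(wifi_to_vpn("2001:470:e962:11c1::1"))
```

For the node discussed later in this message, 2001:470:e962:11c1::1,
this gives 2001:470:e962::11c1.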

None of the nodes connect directly, everything goes through ::407.

From curl -6 http://localhost:$port/neighbors

  https://personaltelco.net/~russell/olsrd/olsrd-neighbors.txt

Henning> I am also a bit worried about your usage of bridges
Henning> connected to mesh interfaces.  Normally you should not bridge
Henning> any interface that OLSR uses for meshing.  Mixing routing
Henning> (L3) and bridging (L2) can go wrong in very creative ways.

I don't understand how the bridges could be a problem in this case.
This is a hub and spoke topology.  One openvpn server in the middle,
nodes at the edges.  None of the nodes interconnect otherwise.  Olsr
is broadcast on the wifi in case there are any olsrd devices nearby,
but, again, there is no overlap in the wifi coverage (and if there
were physically, they are on different SSIDs and wouldn't overlap
logically).

Can you explain more about what in particular worries you?  This
configuration has been stable for us on ipv4 for years, and on ipv6
since at least late 2012 until very recently.  So, I suspect a bug.
Somewhere.

Henning> Txtinfo output (especially /route) would be good to see...
Henning> before the problem, during the problem and after the
Henning> recovery.

I'm using curl -6 http://localhost:$port/routes to get the following
data, before, during and after turning on an ipv6 olsrd on a
particular node (2001:470:e962:11c1::1).

  https://personaltelco.net/~russell/olsrd/olsrd-routes-before.txt
  https://personaltelco.net/~russell/olsrd/olsrd-routes-during.txt
  https://personaltelco.net/~russell/olsrd/olsrd-routes-after.txt
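
One way to compare two such snapshots is a plain line-by-line set
difference.  This is only a sketch: it assumes each non-empty line of
the saved output is one route entry, and the strings in the example
are placeholders, not real txtinfo output.

```python
def diff_snapshots(before: str, after: str):
    """Return (removed, added) lines between two saved snapshots."""
    # Assumption: one route entry per non-empty line.
    before_lines = {ln.strip() for ln in before.splitlines() if ln.strip()}
    after_lines = {ln.strip() for ln in after.splitlines() if ln.strip()}
    removed = sorted(before_lines - after_lines)
    added = sorted(after_lines - before_lines)
    return removed, added

# Placeholder data standing in for two saved snapshots.
before = "route-a\nroute-b\n"
during = "route-b\nroute-c\n"

removed, added = diff_snapshots(before, during)
for line in removed:
    print("-", line)
for line in added:
    print("+", line)
```

Running it on the before/during pair and then on the during/after pair
would show which entries vanish when the routes collapse and which
return afterwards.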

Henning> It would also help if you could reduce the number of nodes
Henning> to a minimum while still replicating the problem.

I don't have that level of control, unfortunately.  When I notice that
the ipv6 routes have collapsed, I pick a likely seeming node (maybe
because it had been plugged in recently) and turn off ipv6 olsrd, and
over 30-60 seconds, magically the routes all come back.  My luck in
guessing the right node to turn off is a little bit "too good", if you
know what I mean, so that I am not sure there is anything particularly
unique about the node I choose.  But, nevertheless, turning it off
seems to help, generally.

FWIW, I'm including olsrd versions here.  The central machine ::407 is
running 0.6.6.1, compiled from the tarball.  The nodes have the
following versions, all built from openwrt routing feed sources.

  https://personaltelco.net/~russell/olsrd/olsrd-versions-by-node.txt

Here is a table listing how many nodes run each olsrd version (openwrt
package versions):

      1 0.6.3-3
     33 0.6.4-1
      1 0.6.5.1-1
      1 0.6.5.1-2
      7 0.6.5.2-1
      1 0.6.5.3-1
      2 0.6.5.4-1
      2 0.6.6-2
      7 0.6.6-3
     11 0.6.6.1-1
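
A tally like this can be produced with a small counting script, in
the spirit of "sort | uniq -c".  This is a sketch only: the layout of
the per-node file is assumed here to be whitespace-separated fields
with the version last, and the node names in the sample are made up.

```python
from collections import Counter

def tally_versions(lines):
    """Count occurrences of each version string, sorted by version."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:                    # skip blank lines
            counts[fields[-1]] += 1   # assumption: version is the last field
    # Plain string sort, matching the ordering of the table above.
    return [f"{counts[v]:7d} {v}" for v in sorted(counts)]

# Hypothetical sample input, not the real per-node list.
sample = [
    "node-a 0.6.4-1",
    "node-b 0.6.6.1-1",
    "node-c 0.6.4-1",
]
for row in tally_versions(sample):
    print(row)
```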


-- 
Russell Senior, President
(spam-protected)



