[Olsr-users] [Olsr-dev] olsrd 0.6.6.1 (and earlier) ipv6 problems

Russell Senior (spam-protected)
Fri Mar 28 11:03:04 CET 2014


They are single hop from the central server, which is the table I've been
posting.


On Fri, Mar 28, 2014 at 3:01 AM, Henning Rogge <(spam-protected)> wrote:

> What?
>
> But your routing tables only contain "ETX 1.0" paths... which means
> they are single hop!
>
> Henning
>
> On Fri, Mar 28, 2014 at 11:00 AM, Russell Senior
> <(spam-protected)> wrote:
> > Without the ipv6 olsrd, the nodes can't route to each other, it seems.  I
> > picked two I had turned off, and tried ping6'ing between them and got
> > 100% packet loss.
> >
> >
> > On Fri, Mar 28, 2014 at 2:54 AM, Henning Rogge <(spam-protected)> wrote:
> >>
> >> Hi,
> >>
> >> as far as I can see each "leaf" node can see each other leaf node over
> >> the OpenVPN, right?
> >>
> >> So you are only using Olsrd to distribute HNAs?
> >>
> >> Henning Rogge
> >>
> >> On Fri, Mar 28, 2014 at 10:48 AM, Russell Senior
> >> <(spam-protected)> wrote:
> >> > The central server, ::407, is running OpenVPN in server mode.  The
> >> > "leaf" nodes all connect to it via OpenVPN client mode with a tap
> >> > interface.  We statically provision the IPv6 addresses on the vpn.
> >> >
> >> > And yes, the OpenVPN links are still active.  We are running an IPv4
> >> > instance of olsrd (same version) in parallel and those routes (to the
> >> > very same devices) are not affected.
> >> >
> >> > We see the problem when particular (though varying) nodes' olsrd ipv6
> >> > instances are started/stopped.  Sometimes the nodes are running
> >> > 0.6.6.1, and sometimes 0.6.4.  It doesn't seem to be version-specific.
> >> > The central server is running 0.6.6.1 now, but we saw the same thing
> >> > earlier (which is why I upgraded) on 0.6.4.
> >> >
> >> > One other potential clue (it doesn't make very much sense, because I
> >> > know there are much bigger networks than ours): I've never seen more
> >> > than 186 ipv6 routes on ::407.  We seem to see the problem when we try
> >> > to exceed that.  I'm going to try to confirm that.
> >> >
> >> >
> >> > On Fri, Mar 28, 2014 at 2:34 AM, Henning Rogge <(spam-protected)>
> >> > wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> I must admit that I am not convinced that what we are seeing is an
> >> >> Olsrd bug...
> >> >>
> >> >> If I see it correctly Olsrd is running over the VPN interface
> >> >> connection (interface name "vpn"), right?
> >> >>
> >> >> Is the VPN connection between the nodes still active during the route
> >> >> loss? Most of the nodes seem to have direct connections and the "30
> >> >> seconds until recovery" sounds like an ETX value slowly going down
> >> >> and then dropping the link.
> >> >>
> >> >> Henning
> >> >>
> >> >> On Fri, Mar 28, 2014 at 10:11 AM, Saverio Proto <(spam-protected)>
> >> >> wrote:
> >> >> > Hello Russell,
> >> >> >
> >> >> > looking at this:
> >> >> >   https://personaltelco.net/~russell/olsrd/olsrd-routes-before.txt
> >> >> >   https://personaltelco.net/~russell/olsrd/olsrd-routes-during.txt
> >> >> >   https://personaltelco.net/~russell/olsrd/olsrd-routes-after.txt
> >> >> >
> >> >> > it looks like IPv6 routes are removed from the olsrd database. So it
> >> >> > is actually the olsrd daemon that is involved.
> >> >> >
> >> >> > Do you know if there is a previous stable version of olsrd where
> >> >> > this bug/behaviour is not present?
> >> >> >
> >> >> > In my opinion the fastest way to track the bug is to try different
> >> >> > versions of olsrd with the "git bisect" method.
> >> >> >
> >> >> > The first step is to tell us if there is a version of olsrd that is
> >> >> > not affected by this problem.
> >> >> >
> >> >> > thanks
> >> >> >
> >> >> > I cc: olsrd-dev
> >> >> >
> >> >> > Saverio
> >> >> >
> >> >> >
> >> >> > 2014-03-27 10:37 GMT+01:00 Russell Senior
> >> >> > <(spam-protected)>:
> >> >> >>>>>>> "Henning" == Henning Rogge <(spam-protected)> writes:
> >> >> >>
> >> >> >> Henning> On 03/26/2014 07:41 PM, Russell Senior wrote:
> >> >> >>>> Anybody get a chance to look at the strace?  I see a:
> >> >> >>
> >> >> >> Henning> strace and packet dumps are much too low-level to directly
> >> >> >> Henning> hunt problems like this. That's why Saverio's question
> >> >> >> Henning> about txtinfo is good, because it gives you a much more
> >> >> >> Henning> high-level view on what is going on.
> >> >> >>
> >> >> >> I had not installed the modules previously, so that interface
> >> >> >> wasn't immediately available.  It is now.
> >> >> >>
> >> >> >> [...]
> >> >> >>
> >> >> >> Henning> Okay, let's get back to the high-level view.
> >> >> >>
> >> >> >> Henning> To interpret the events you described we need a list of
> >> >> >> Henning> nodes, with their interface IPs and the connectivity
> >> >> >> Henning> between them.
> >> >> >>
> >> >> >> Here is the list of neighbors of 2001:470:e962::407.  The
> >> >> >> addresses listed are on the public wifi.  The OpenVPN addresses of
> >> >> >> each node are a permutation, e.g. if the public wifi addr is
> >> >> >> 2001:470:e962:wxyz::1, then the OpenVPN address of the node is
> >> >> >> 2001:470:e962::wxyz.
> >> >> >>
> >> >> >> None of the nodes connect directly, everything goes through ::407.
> >> >> >>
> >> >> >> From curl -6 http://localhost:$port/neighbors
> >> >> >>
> >> >> >>   https://personaltelco.net/~russell/olsrd/olsrd-neighbors.txt
> >> >> >>
> >> >> >> Henning> I am also a bit worried about your usage of bridges
> >> >> >> Henning> connected to mesh interfaces.  Normally you should not
> >> >> >> Henning> bridge any interface that OLSR uses for meshing.  Mixing
> >> >> >> Henning> routing (L3) and bridging (L2) can go wrong in very
> >> >> >> Henning> creative ways.
> >> >> >>
> >> >> >> I don't understand how the bridges could be a problem in this
> >> >> >> case.  This is a hub and spoke topology.  One openvpn server in the
> >> >> >> middle, nodes at the edges.  None of the nodes interconnect
> >> >> >> otherwise.  Olsr is broadcast on the wifi in case there are any
> >> >> >> olsrd devices nearby, but, again, there is no overlap in the wifi
> >> >> >> coverage (and if there were physically, they are on different SSIDs
> >> >> >> and wouldn't overlap logically).
> >> >> >>
> >> >> >> Can you explain more about what in particular would make you
> >> >> >> worry?  This configuration has been stable for us on ipv4 for years
> >> >> >> and also on ipv6 until very recently, since late 2012 at least.
> >> >> >> So, I suspect a bug.  Somewhere.
> >> >> >>
> >> >> >> Henning> Txtinfo output (especially /route) would be good to
> >> >> >> Henning> see...  before the problem, during the problem and
> >> >> >> Henning> after the recovery.
> >> >> >>
> >> >> >> I'm using curl -6 http://localhost:$port/routes to get the
> >> >> >> following data, before, during and after turning on an ipv6 olsrd
> >> >> >> on a particular node (2001:470:e962:11c1::1).
> >> >> >>
> >> >> >>   https://personaltelco.net/~russell/olsrd/olsrd-routes-before.txt
> >> >> >>   https://personaltelco.net/~russell/olsrd/olsrd-routes-during.txt
> >> >> >>   https://personaltelco.net/~russell/olsrd/olsrd-routes-after.txt
> >> >> >>
> >> >> >> Henning> It would also help if you can reduce the number of nodes
> >> >> >> Henning> to a minimum while still replicating the problem.
> >> >> >>
> >> >> >> I don't have that level of control, unfortunately.  When I notice
> >> >> >> that the ipv6 routes have collapsed, I pick a likely seeming node
> >> >> >> (maybe because it had been plugged in recently) and turn off ipv6
> >> >> >> olsrd, and over 30-60 seconds, magically the routes all come back.
> >> >> >> My luck in guessing the right node to turn off is a little bit "too
> >> >> >> good", if you know what I mean, so that I am not sure there is
> >> >> >> anything particularly unique about the node I choose.  But,
> >> >> >> nevertheless, turning it off seems to help, generally.
> >> >> >>
> >> >> >> FWIW, I'm including olsrd versions here.  The central machine
> >> >> >> ::407 is running 0.6.6.1, compiled from the tarball.  The nodes
> >> >> >> have the following versions, all built from openwrt routing feed
> >> >> >> sources.
> >> >> >>
> >> >> >>   https://personaltelco.net/~russell/olsrd/olsrd-versions-by-node.txt
> >> >> >>
> >> >> >> Here is a table listing the frequency of each openwrt version:
> >> >> >>
> >> >> >>       1 0.6.3-3
> >> >> >>      33 0.6.4-1
> >> >> >>       1 0.6.5.1-1
> >> >> >>       1 0.6.5.1-2
> >> >> >>       7 0.6.5.2-1
> >> >> >>       1 0.6.5.3-1
> >> >> >>       2 0.6.5.4-1
> >> >> >>       2 0.6.6-2
> >> >> >>       7 0.6.6-3
> >> >> >>      11 0.6.6.1-1
> >> >> >>
> >> >> >>
> >> >> >> --
> >> >> >> Russell Senior, President
> >> >> >> (spam-protected)
> >> >> >>
> >> >> >> --
> >> >> >> Olsr-users mailing list
> >> >> >> (spam-protected)
> >> >> >> https://lists.olsr.org/mailman/listinfo/olsr-users
> >> >> >
> >> >> > --
> >> >> > Olsr-dev mailing list
> >> >> > (spam-protected)
> >> >> > https://lists.olsr.org/mailman/listinfo/olsr-dev
> >> >
> >> >
> >
> >
>
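
For readers following the addressing scheme quoted above (a public wifi
address of the form 2001:470:e962:wxyz::1 maps to the OpenVPN address
2001:470:e962::wxyz), here is a minimal Python sketch of that permutation.
The function name wifi_to_vpn and the /48 prefix length are assumptions made
for illustration; the thread only gives the address pattern, not any tooling.

```python
import ipaddress

# Assumption: the shared prefix is 2001:470:e962::/48, inferred from the
# address pattern in the thread (the prefix length itself is not stated).
PREFIX = ipaddress.IPv6Network("2001:470:e962::/48")


def wifi_to_vpn(wifi_addr: str) -> str:
    """Map a public wifi address 2001:470:e962:wxyz::1 to the OpenVPN
    address 2001:470:e962::wxyz (hypothetical helper, not part of olsrd)."""
    addr = ipaddress.IPv6Address(wifi_addr)
    if addr not in PREFIX:
        raise ValueError(f"{wifi_addr} is not inside {PREFIX}")
    value = int(addr)
    # The fourth 16-bit group ("wxyz") sits just above the low 64 bits.
    subnet = (value >> 64) & 0xFFFF
    # The interface identifier (low 64 bits) is expected to be ::1.
    if value & ((1 << 64) - 1) != 1:
        raise ValueError(f"{wifi_addr} does not end in ::1")
    # Move "wxyz" into the lowest 16 bits of the shared prefix.
    vpn = int(PREFIX.network_address) | subnet
    return str(ipaddress.IPv6Address(vpn))


if __name__ == "__main__":
    # The node mentioned in the thread: prints 2001:470:e962::11c1
    print(wifi_to_vpn("2001:470:e962:11c1::1"))
```

Applied to the neighbor list, this would let the wifi addresses reported by
txtinfo be matched against the addresses statically provisioned on the vpn.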