[Olsr-users] [Olsr-dev] olsrd 0.6.6.1 (and earlier) ipv6 problems

Russell Senior (spam-protected)
Fri Mar 28 12:26:21 CET 2014


It could be, but it isn't ;-)

Even with Mode "ether" commented out, I'm still seeing the same route
collapse behavior when too many devices are on.

Commenting it out did improve the route propagation to the "leaf" nodes
though.
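
For anyone who wants to compare route counts without header/footer lines
creeping into the total, here is a minimal Python sketch. It is not part of
olsrd; the function names are hypothetical, and it assumes the txtinfo routes
output has the five whitespace-separated columns shown in the "Table: Routes"
excerpt quoted further down. It counts only rows that parse as a route and
reports which destinations vanished between two snapshots.

```python
import ipaddress


def parse_routes(text):
    """Return (destination, gateway, interface) tuples from a txtinfo routes dump.

    Assumes each route row has exactly five whitespace-separated fields:
    destination/prefix, gateway, metric, ETX, interface.
    """
    routes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skips blank lines, "Table: Routes", and the column header
        dest, gateway, _metric, _etx, iface = fields
        try:
            net = ipaddress.ip_network(dest, strict=False)
            gw = ipaddress.ip_address(gateway)
        except ValueError:
            continue  # not an address row, so do not count it
        routes.append((net, gw, iface))
    return routes


def lost_routes(before_text, after_text):
    """Destinations present in the 'before' snapshot but missing from 'after'."""
    before = {r[0] for r in parse_routes(before_text)}
    after = {r[0] for r in parse_routes(after_text)}
    return before - after


# Sample copied from the two-route client table quoted in this thread.
SAMPLE = """Table: Routes
Destination     Gateway IP      Metric  ETX     Interface
::/0    2001:470:e962::407      1       1.000   ptp
2001:470:e962::407/128  2001:470:e962::407      1       1.000   ptp
"""

if __name__ == "__main__":
    print(len(parse_routes(SAMPLE)), "routes")
```

Feeding it the saved before/during/after dumps should make it easy to see
whether the collapse always starts at the same route count.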


On Fri, Mar 28, 2014 at 4:23 AM, Henning Rogge <(spam-protected)> wrote:

> Hi,
>
> mode "ether" should only be used by a group of OLSRd that run on the
> same ethernet switch... which means everyone can see everyone else.
>
> It suppresses some forwarding of incoming OLSR messages because it
> assumes that everyone on this interface has already seen the message
> anyways.
>
> If the "hub" had "mode ether" activated, it could be a good explanation
> of what happened.
>
> Henning
>
> On Fri, Mar 28, 2014 at 12:01 PM, Russell Senior
> <(spam-protected)> wrote:
> > I should have done this earlier, but here are my olsrd.conf files.  On the
> > server:
> >
> > ===================================================
> > IpVersion 6
> >
> > #Hna4
> > #{
> > #}
> >
> > Hna6
> > {
> >         0::     0
> > }
> >
> > LinkQualityFishEye  0
> >
> > LoadPlugin "olsrd_txtinfo.so.0.1"
> > {
> >         PlParam "port" "7862"
> > }
> >
> > #############################################
> > ### OLSRD default interface configuration ###
> > #############################################
> > # the default interface section can have the same values as the following
> > # interface configuration. It will allow you to set common options for all
> > # interfaces.
> >
> > InterfaceDefaults {
> >         # Ip4Broadcast      255.255.255.255
> > }
> >
> > Interface "ptp" "ptp-udp" "vpn" "iris"
> > {
> > #       Mode "ether"
> > }
> > =====================================================
> >
> > I am pretty sure that the Hna4 { } part had been there uncommented for a
> > while.  The Mode "ether" was uncommented too.  When I comment them out, as
> > above, and restart, I see the individual routes on the client, as you would
> > expect.  I had noticed the "route aggregation" and been a little surprised,
> > but having just moved to a newer version, I wasn't too suspicious.
> >
> > On the clients:
> >
> > =====================================================
> >
> > IpVersion 6
> >
> > LinkQualityFishEye 0
> >
> > Hna6
> > {
> >         2001:470:e962:xxyy::    64
> > }
> >
> > LoadPlugin "olsrd_txtinfo.so.0.1"
> > {
> >         PlParam "port" "7862"
> > }
> >
> > Interface "br-pub" "ptp"
> > {
> > }
> > =====================================================
> >
> > When it's working, I see 177 olsrd routes (the 180 figure included some
> > header/footer lines, apparently) on the server and 176 on the client.  But
> > if I add another node, the routes all collapse still.  It is confusing
> > though.  Sometimes, I only see two routes, as below, apparently when Mode
> > "ether" is in force.  It's confusing because sometimes I was seeing the
> > more complete client routing table even with Mode "ether".
> >
> > Table: Routes
> > Destination     Gateway IP      Metric  ETX     Interface
> > ::/0    2001:470:e962::407      1       1.000   ptp
> > 2001:470:e962::407/128  2001:470:e962::407      1       1.000   ptp
> >
> > I am turning Mode "ether" off again, and I seem to get a complete set of
> > routes (one less than the server) on the clients.
> >
> > Again, though, if I add one more node, the routes on both the server and
> > clients collapse.  The clients go to zero.  The server has routes to one or
> > sometimes two clients, which vary a little bit.
> >
> >
> >
> >
> > On Fri, Mar 28, 2014 at 3:09 AM, Henning Rogge <(spam-protected)> wrote:
> >>
> >> Each leaf should have a /128 route for each other leaf...
> >>
> >> Olsrd does NOT do any route aggregation.
> >>
> >> Can you show me a routing table of a leaf and the txtinfo output when
> >> everything is fine?
> >>
> >> Henning
> >>
> >> On Fri, Mar 28, 2014 at 11:06 AM, Russell Senior
> >> <(spam-protected)> wrote:
> >> > FWIW, the ipv6 routing tables on the "leaf" nodes are quite short, with
> >> > mostly just a default route pointing at the central server, when olsrd
> >> > is working.  When the central server has the route collapse, the default
> >> > route on the "leaf" nodes disappears.
> >> >
> >> > I am thinking about memory exhaustion, maybe something is helpfully
> >> > killing it off when the size becomes "too large" ... /me goes to look for
> >> > evidence of that.
> >> >
> >> >
> >> > On Fri, Mar 28, 2014 at 3:03 AM, Russell Senior
> >> > <(spam-protected)>
> >> > wrote:
> >> >>
> >> >> They are single hop from the central server, which is the table I've
> >> >> been posting.
> >> >>
> >> >>
> >> >> On Fri, Mar 28, 2014 at 3:01 AM, Henning Rogge <(spam-protected)>
> >> >> wrote:
> >> >>>
> >> >>> What?
> >> >>>
> >> >>> but your routing tables only contain "ETX 1.0" paths... which means
> >> >>> they are single hop!
> >> >>>
> >> >>> Henning
> >> >>>
> >> >>> On Fri, Mar 28, 2014 at 11:00 AM, Russell Senior
> >> >>> <(spam-protected)> wrote:
> >> >>> > Without the ipv6 olsrd, the nodes can't route to each other, it
> >> >>> > seems.  I picked two I had turned off, and tried ping6'ing between
> >> >>> > them and got 100% packet loss.
> >> >>> >
> >> >>> >
> >> >>> > On Fri, Mar 28, 2014 at 2:54 AM, Henning Rogge <(spam-protected)>
> >> >>> > wrote:
> >> >>> >>
> >> >>> >> Hi,
> >> >>> >>
> >> >>> >> as far as I can see each "leaf" node can see each other leaf node
> >> >>> >> over the OpenVPN, right?
> >> >>> >>
> >> >>> >> So you are only using Olsrd to distribute HNAs?
> >> >>> >>
> >> >>> >> Henning Rogge
> >> >>> >>
> >> >>> >> On Fri, Mar 28, 2014 at 10:48 AM, Russell Senior
> >> >>> >> <(spam-protected)> wrote:
> >> >>> >> > The central server, ::407, is running OpenVPN in server mode.  The
> >> >>> >> > "leaf" nodes all connect to it via OpenVPN client mode with a tap
> >> >>> >> > interface.  We statically provision the IPv6 addresses on the vpn.
> >> >>> >> >
> >> >>> >> > And yes, the OpenVPN links are still active.  We are running an
> >> >>> >> > IPv4 instance of olsrd (same version) in parallel and those routes
> >> >>> >> > (to the very same devices) are not affected.
> >> >>> >> >
> >> >>> >> > We see the problem when particular (though varying) nodes' olsrd
> >> >>> >> > ipv6 instances are started/stopped.  Sometimes the nodes are running
> >> >>> >> > 0.6.6.1, and sometimes 0.6.4.  It doesn't seem to be specific.  The
> >> >>> >> > central server is running 0.6.6.1 now, but we saw the same thing
> >> >>> >> > earlier (which is why I upgraded) on 0.6.4.
> >> >>> >> >
> >> >>> >> > One other potential clue (it doesn't make very much sense, because I
> >> >>> >> > know there are much bigger networks than ours): I've never seen more
> >> >>> >> > than 186 ipv6 routes on ::407.  We seem to see the problem when we
> >> >>> >> > try to exceed that.  I'm going to try to confirm that.
> >> >>> >> >
> >> >>> >> >
> >> >>> >> > On Fri, Mar 28, 2014 at 2:34 AM, Henning Rogge <(spam-protected)>
> >> >>> >> > wrote:
> >> >>> >> >>
> >> >>> >> >> Hi,
> >> >>> >> >>
> >> >>> >> >> I must admit that I am not convinced that what we are seeing is an
> >> >>> >> >> Olsrd bug...
> >> >>> >> >>
> >> >>> >> >> If I see it correctly Olsrd is running over the VPN interface
> >> >>> >> >> connection (interface name "vpn"), right?
> >> >>> >> >>
> >> >>> >> >> Is the VPN connection between the nodes still active during the
> >> >>> >> >> route loss? Most of the nodes seem to have direct connections and
> >> >>> >> >> the "30 seconds until recovery" sounds like an ETX value slowly
> >> >>> >> >> going down and then dropping the link.
> >> >>> >> >>
> >> >>> >> >> Henning
> >> >>> >> >>
> >> >>> >> >> On Fri, Mar 28, 2014 at 10:11 AM, Saverio Proto
> >> >>> >> >> <(spam-protected)>
> >> >>> >> >> wrote:
> >> >>> >> >> > Hello Russell,
> >> >>> >> >> >
> >> >>> >> >> > looking at this:
> >> >>> >> >> >
> >> >>> >> >> > https://personaltelco.net/~russell/olsrd/olsrd-routes-before.txt
> >> >>> >> >> > https://personaltelco.net/~russell/olsrd/olsrd-routes-during.txt
> >> >>> >> >> > https://personaltelco.net/~russell/olsrd/olsrd-routes-after.txt
> >> >>> >> >> >
> >> >>> >> >> > it looks like IPv6 routes are removed from the olsrd database.
> >> >>> >> >> > So it is actually the olsrd daemon that is involved.
> >> >>> >> >> >
> >> >>> >> >> > do you know if there is a previous stable version of olsrd where
> >> >>> >> >> > this bug/behaviour is not present?
> >> >>> >> >> >
> >> >>> >> >> > In my opinion the fastest way to track the bug is to try different
> >> >>> >> >> > versions of olsrd with the "git bisect" method.
> >> >>> >> >> >
> >> >>> >> >> > The first step is to tell us if there is a version of olsrd that
> >> >>> >> >> > is not affected by this problem.
> >> >>> >> >> >
> >> >>> >> >> > thanks
> >> >>> >> >> >
> >> >>> >> >> > I cc: olsrd-dev
> >> >>> >> >> >
> >> >>> >> >> > Saverio
> >> >>> >> >> >
> >> >>> >> >> >
> >> >>> >> >> > 2014-03-27 10:37 GMT+01:00 Russell Senior
> >> >>> >> >> > <(spam-protected)>:
> >> >>> >> >> >>>>>>> "Henning" == Henning Rogge
> >> >>> >> >> >>>>>>> <(spam-protected)>
> >> >>> >> >> >>>>>>> writes:
> >> >>> >> >> >>
> >> >>> >> >> >> Henning> On 03/26/2014 07:41 PM, Russell Senior wrote:
> >> >>> >> >> >>>> Anybody get a chance to look at the strace?  I see a:
> >> >>> >> >> >>
> >> >>> >> >> >> Henning> strace and packet dumps are much too low-level to
> >> >>> >> >> >> Henning> directly hunt problems like this. That's why Saverio's
> >> >>> >> >> >> Henning> question about txtinfo is good, because it gives you a
> >> >>> >> >> >> Henning> much more high-level view of what is going on.
> >> >>> >> >> >>
> >> >>> >> >> >> I had not installed the modules previously, so that interface
> >> >>> >> >> >> wasn't immediately available.  It is now.
> >> >>> >> >> >>
> >> >>> >> >> >> [...]
> >> >>> >> >> >>
> >> >>> >> >> >> Henning> Okay, let's get back to the high-level view.
> >> >>> >> >> >>
> >> >>> >> >> >> Henning> To interpret the events you described we need a list
> >> >>> >> >> >> Henning> of nodes, with their interface IPs and the connectivity
> >> >>> >> >> >> Henning> between them.
> >> >>> >> >> >>
> >> >>> >> >> >> Here is the list of neighbors of 2001:470:e962::407.  The
> >> >>> >> >> >> addresses listed are on the public wifi.  The OpenVPN addresses
> >> >>> >> >> >> of each node are a permutation, e.g. if the public wifi addr is
> >> >>> >> >> >> 2001:470:e962:wxyz::1, then the OpenVPN address of the node is
> >> >>> >> >> >> 2001:470:e962::wxyz.
> >> >>> >> >> >>
> >> >>> >> >> >> None of the nodes connect directly, everything goes through
> >> >>> >> >> >> ::407.
> >> >>> >> >> >>
> >> >>> >> >> >> From curl -6 http://localhost:$port/neighbors
> >> >>> >> >> >>
> >> >>> >> >> >> https://personaltelco.net/~russell/olsrd/olsrd-neighbors.txt
> >> >>> >> >> >>
> >> >>> >> >> >> Henning> I am also a bit worried about your usage of bridges
> >> >>> >> >> >> Henning> connected to mesh interfaces.  Normally you should not
> >> >>> >> >> >> Henning> bridge any interface that OLSR uses for meshing.  Mixing
> >> >>> >> >> >> Henning> routing (L3) and bridging (L2) can go wrong in very
> >> >>> >> >> >> Henning> creative ways.
> >> >>> >> >> >>
> >> >>> >> >> >> I don't understand how the bridges could be a problem in this
> >> >>> >> >> >> case.  This is a hub and spoke topology.  One openvpn server in
> >> >>> >> >> >> the middle, nodes at the edges.  None of the nodes interconnect
> >> >>> >> >> >> otherwise.  Olsr is broadcast on the wifi in case there are any
> >> >>> >> >> >> olsrd devices nearby, but, again, there is no overlap in the wifi
> >> >>> >> >> >> coverage (and if there were physically, they are on different
> >> >>> >> >> >> SSIDs and wouldn't overlap logically).
> >> >>> >> >> >>
> >> >>> >> >> >> Can you explain more about what in particular would make you
> >> >>> >> >> >> worry?  This configuration has been stable for us on ipv4 for
> >> >>> >> >> >> years and also on ipv6 until very recently, since late 2012 at
> >> >>> >> >> >> least.  So, I suspect a bug.  Somewhere.
> >> >>> >> >> >>
> >> >>> >> >> >> Henning> Txtinfo output (especially /route) would be good to
> >> >>> >> >> >> Henning> see...  before the problem, during the problem and
> >> >>> >> >> >> Henning> after the recovery.
> >> >>> >> >> >>
> >> >>> >> >> >> I'm using curl -6 http://localhost:$port/routes to get the
> >> >>> >> >> >> following data, before, during and after turning on an ipv6
> >> >>> >> >> >> olsrd on a particular node (2001:470:e962:11c1::1).
> >> >>> >> >> >>
> >> >>> >> >> >> https://personaltelco.net/~russell/olsrd/olsrd-routes-before.txt
> >> >>> >> >> >> https://personaltelco.net/~russell/olsrd/olsrd-routes-during.txt
> >> >>> >> >> >> https://personaltelco.net/~russell/olsrd/olsrd-routes-after.txt
> >> >>> >> >> >>
> >> >>> >> >> >> Henning> It would also help if you can reduce the number of
> >> >>> >> >> >> Henning> nodes to a minimum while still replicating the problem.
> >> >>> >> >> >>
> >> >>> >> >> >> I don't have that level of control, unfortunately.  When I notice
> >> >>> >> >> >> that the ipv6 routes have collapsed, I pick a likely seeming node
> >> >>> >> >> >> (maybe because it had been plugged in recently) and turn off ipv6
> >> >>> >> >> >> olsrd, and over 30-60 seconds, magically the routes all come
> >> >>> >> >> >> back.  My luck in guessing the right node to turn off is a little
> >> >>> >> >> >> bit "too good", if you know what I mean, so that I am not sure
> >> >>> >> >> >> there is anything particularly unique about the node I choose.
> >> >>> >> >> >> But, nevertheless, turning it off seems to help, generally.
> >> >>> >> >> >>
> >> >>> >> >> >> FWIW, I'm including olsrd versions here.  The central machine
> >> >>> >> >> >> ::407 is running 0.6.6.1, compiled from the tarball.  The nodes
> >> >>> >> >> >> have the following versions, all built from openwrt routing feed
> >> >>> >> >> >> sources.
> >> >>> >> >> >>
> >> >>> >> >> >> https://personaltelco.net/~russell/olsrd/olsrd-versions-by-node.txt
> >> >>> >> >> >>
> >> >>> >> >> >> Here is a table listing the frequency of each openwrt version:
> >> >>> >> >> >>
> >> >>> >> >> >>       1 0.6.3-3
> >> >>> >> >> >>      33 0.6.4-1
> >> >>> >> >> >>       1 0.6.5.1-1
> >> >>> >> >> >>       1 0.6.5.1-2
> >> >>> >> >> >>       7 0.6.5.2-1
> >> >>> >> >> >>       1 0.6.5.3-1
> >> >>> >> >> >>       2 0.6.5.4-1
> >> >>> >> >> >>       2 0.6.6-2
> >> >>> >> >> >>       7 0.6.6-3
> >> >>> >> >> >>      11 0.6.6.1-1
> >> >>> >> >> >>
> >> >>> >> >> >>
> >> >>> >> >> >> --
> >> >>> >> >> >> Russell Senior, President
> >> >>> >> >> >> (spam-protected)