[Olsr-dev] [Olsr-users] Route on repeater nodes sometimes break when gateway node reboots

Ferry Huberts (spam-protected)
Mon Mar 24 17:10:06 CET 2014


Forgot to include the list...


As a quick reply,

You're on an old version of olsrd. Can you please try to reproduce with 
the latest from master?

Second, on your version it takes at least a minute (with default config) 
before a new gateway is chosen.

The extra features for sgw in master also include some under-the-hood 
bugfixes that I discovered during engineering of the extra features.

If the problem still persists on master, then we really need to look at it.

I'll read through your entire message later this week.

On 24/03/14 16:44, Ben West wrote:
> Hi Ferry,
>
> Thanks for responding.  I am including at the bottom the
> network/firewall/olsrd configs for the nodes in question.  Note that the
> nodes each broadcast 3 wireless virtual interfaces, an adhoc I/F and two
> APs (public and private).
>
> The symptom observed is that the repeater node described below, after
> rebooting the gateway node, becomes unable to ping anything beyond the
> gateway node's WAN IP.  Restarting olsrd on the repeater node usually
> resolves the problem, although in this instance I'm writing about, I had
> to restart olsrd on both nodes.  Also, rebooting the gateway node
> doesn't consistently trigger this problem on the repeater.  I think the
> problem may be more likely to occur when the gateway node undergoes an
> uncommanded reboot.
>
> UPDATE: At the onset of this problem, I noticed that the repeater node
> had a tunnel from SmartGateway named "tnl_05761352" listed by ifconfig.
> The gateway node just listed the generic tunnel "tunl0."  Restarting
> olsrd on the repeater and then the gateway node (in that order), did not
> cause "tnl_05761352" to be rebuilt; just a generic "tunl0" was listed on
> both nodes.  At this point, the repeater node was getting correct
> routing beyond the WAN.  Additionally, restarting olsrd on the repeater
> node a /2nd time/ did lead to a new tunnel named tnl_c0a80466 appearing,
> and routing across all nodes continued to work normally.
>
> Also, upon restarting olsrd on the gateway node, these messages appeared
> in syslog:
> Mar 24 15:14:02 MyGateway daemon.info olsrd[1559]: Received netlink error code No such file or directory (-2)
> Mar 24 15:14:02 MyGateway daemon.err olsrd[1559]: Error on deleting policy rule aimed to activate RtTable 223!
>
> Do SmartGateway tunnel names somehow grow stale, such that a gateway
> node undergoing uncommanded reboot might leave repeater nodes with an
> orphaned half of their tunnel?
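
On the tunnel names: the two names quoted above look like the hex encoding of an IPv4 address. Decoding them gives 5.118.19.82 (the gateway's mesh IP) for tnl_05761352 and 192.168.4.102 (the gateway's WAN IP) for tnl_c0a80466. That is only an observation from the values in this thread, not a statement of how olsrd generates the names, and the helper name decode_tunnel_name is made up for illustration. A minimal Python sketch of the decoding:

```python
import ipaddress

def decode_tunnel_name(name, prefix="tnl_"):
    """Interpret the 8 hex digits after the prefix as a big-endian IPv4 address.

    Assumption: the suffix is an IPv4 address in hex. This matches the two
    names seen in this thread but is not confirmed against the olsrd source.
    """
    if not name.startswith(prefix):
        raise ValueError("not a tnl_ style name: %r" % name)
    suffix = name[len(prefix):]
    if len(suffix) != 8:
        raise ValueError("expected 8 hex digits, got %r" % suffix)
    return str(ipaddress.IPv4Address(int(suffix, 16)))

for tunnel in ("tnl_05761352", "tnl_c0a80466"):
    print(tunnel, "->", decode_tunnel_name(tunnel))
```

If that reading is right, the tunnel present at the onset of the problem pointed at the gateway's mesh address and the later one at its WAN address, which may be worth noting when trying to reproduce.
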
>
> Anyway, here is the olsrd config used by both nodes.
> -----
> config olsrd
>      # uncomment the following line to use a custom config file instead:
>      #option config_file '/etc/olsrd.conf'
>
>      option 'IpVersion' '4'
>      option 'LinkQualityLevel' '2'
>      option 'LinkQualityAlgorithm' 'etx_ffeth'
>      option 'SmartGateway' 'yes'
>      option 'Pollrate' '0.1'
>      option 'TcRedundancy'    '2'
>      option 'MprCoverage'    '5'
>
> config 'LoadPlugin'
>      option 'library' 'olsrd_arprefresh.so.0.1'
>
> config 'LoadPlugin'
>      option 'library' 'olsrd_dyn_gw.so.0.5'
>
> config 'LoadPlugin'
>      option 'library' 'olsrd_dyn_gw_plain.so.0.4'
>
> config 'LoadPlugin'
>    option 'library' 'olsrd_nameservice.so.0.3'
>    #option 'resolv_file' '/tmp/resolv.conf.auto'
>    option 'sighup_pid_file' '/var/run/dnsmasq.pid'
>    option 'suffix' '.mesh'
>
> config 'LoadPlugin'
>      option 'library' 'olsrd_txtinfo.so.0.1'
>      option 'accept' '0.0.0.0'
>
> config 'Interface'
>      list 'interface' 'mesh'
>      option 'Ip4Broadcast' '255.255.255.255'
>      option 'Mode' 'mesh'
> #
> -----
>
> Next, here is network / firewall config on the gateway node.  The blurb
> in firewall.user came from prior recommendation on this listserv about
> how to clamp MTU values for incoming, WAN-bound traffic in the
> SmartGateway tunnel.  That iptables rule is only used on the gateway node.
>
> ----
> /etc/config/network:
> config interface loopback
>      option ifname    lo
>      option proto    static
>      option ipaddr    127.0.0.1
>      option netmask    255.0.0.0
>
> config interface wan
>      option ifname    eth0
>      option proto    dhcp
>
> config interface 'mesh'
>      option proto 'static'
>      option ipaddr '5.118.19.82'
>      option dns    '208.67.222.222 208.67.222.220'
>      option netmask '255.0.0.0'
>      option broadcast '255.255.255.255'
>
> config interface 'ap1'
>      option proto 'static'
>      option type 'bridge'
>      option ipaddr '101.19.82.1'
>      option netmask '255.255.255.0'
>      option broadcast '101.19.82.255'
>
> config 'interface' 'ap2'
>      option 'proto' 'static'
>      option 'ipaddr' 102.19.82.1
>      option 'netmask' 255.255.255.0
>      option broadcast '102.19.82.255'
> ----
>
> /etc/config/firewall:
> config defaults
>      option syn_flood    1
>      option input        ACCEPT
>      option output        ACCEPT
>      option forward        REJECT
> # Uncomment this line to disable ipv6 rules
>      option disable_ipv6    1
>
> config zone
>      option name        wan
>      option network        'wan'
>      option input        ACCEPT
>      option output        ACCEPT
>      option forward        ACCEPT
>      option masq        1
>
> config zone
>      option name 'mesh'
>      option network 'mesh'
>      option input 'ACCEPT'
>      option output 'ACCEPT'
>      option forward 'ACCEPT'
>      option masq    1
>
> config zone
>      option name 'ap1'
>      option network 'ap1'
>      option input 'ACCEPT'
>      option output 'ACCEPT'
>      option forward 'DROP'
>
> config zone
>      option name 'ap2'
>      option network 'ap2'
>      option input 'ACCEPT'
>      option output 'ACCEPT'
>      option forward 'ACCEPT'
>
> config forwarding
>      option src 'mesh'
>      option dest 'wan'
>
> config forwarding
>       option src 'mesh'
>       option dest 'mesh'
>
> config forwarding
>       option src 'ap1'
>       option dest 'wan'
>
> config forwarding
>      option src 'ap1'
>      option dest 'ap1'
>
> config forwarding
>       option src 'ap2'
>       option dest 'wan'
>
> config forwarding
>       option src 'ap2'
>       option dest 'ap2'
>
> ----
> /etc/firewall.user:
> # Clamp the TCP MSS of traffic leaving via an OLSRd tunnel to fit the tunnel MTU
> iptables -t mangle -A POSTROUTING -o tnl_+ -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1440
> ----
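
For reference, the 1440 in that rule is consistent with plain IPv4-in-IPv4 encapsulation over a 1500-byte link: the outer IPv4 header takes 20 bytes (leaving a tunnel MTU of 1480), and the inner IPv4 and TCP headers take another 20 bytes each. The 1500-byte link MTU and the absence of IP/TCP options are assumptions, and the function name is hypothetical. A small Python sketch of the arithmetic:

```python
IPV4_HEADER = 20  # bytes, without options
TCP_HEADER = 20   # bytes, without options

def ipip_tcp_mss(link_mtu=1500):
    """Largest TCP payload that fits in one packet through an IPv4-in-IPv4 tunnel."""
    tunnel_mtu = link_mtu - IPV4_HEADER           # room left after the outer IPv4 header
    return tunnel_mtu - IPV4_HEADER - TCP_HEADER  # minus inner IPv4 and TCP headers

print(ipip_tcp_mss())      # 1440
print(ipip_tcp_mss(1492))  # a smaller uplink MTU, e.g. PPPoE, gives a smaller MSS
```

If the WAN uplink has an MTU below 1500, the clamp value would need to drop accordingly.
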
>
> Below is the gateway node's txtinfo output.  The 192.168.4.102 is the
> WAN IP of the gateway node.
>
> ----
> Table: Links
> Local IP    Remote IP    Hyst.    LQ    NLQ    Cost
> 5.118.19.82    5.211.164.176    0.00    0.874    1.000    1.143
>
> Table: Neighbors
> IP address    SYM    MPR    MPRS    Will.    2 Hop Neighbors
> 5.211.164.176    YES    NO    NO    3    0
>
> Table: Topology
> Dest. IP    Last hop IP    LQ    NLQ    Cost
> 5.211.164.176    5.118.19.82    0.874    1.000    1.143
> 5.118.19.82    5.211.164.176    1.000    0.874    1.143
>
> Table: HNA
> Destination    Gateway
> 0.0.0.0/0    5.118.19.82
> 0.0.0.0/0    5.118.19.82
>
> Table: MID
> IP address    Aliases
>
> Table: Routes
> Destination    Gateway IP    Metric    ETX    Interface
> 5.211.164.176/32    5.211.164.176    1    1.143    wlan0-2
> ----
>
> Likewise, here is network / firewall config on the repeater node.  The
> stanzas for the 'wan' logical interface remain in place, although they're
> not actually invoked for any physical interface.
> ----
> /etc/config/network:
> config interface loopback
>      option ifname    lo
>      option proto    static
>      option ipaddr    127.0.0.1
>      option netmask    255.0.0.0
>
> config interface wan
> #    option ifname    eth0
>      option proto    dhcp
>
> config interface 'mesh'
>      option proto 'static'
>      option ipaddr '5.211.164.176'
>      option dns    '208.67.222.222 208.67.222.220'
>      option netmask '255.0.0.0'
>      option broadcast '255.255.255.255'
>
> config interface 'ap1'
>      option proto 'static'
>      option type 'bridge'
>      option ipaddr '101.164.176.1'
>      option netmask '255.255.255.0'
>      option broadcast '101.164.176.255'
>
> config interface 'ap2'
>      option proto 'static'
>      option type 'bridge'
>      option ifname    eth0
>      option ipaddr '102.164.176.1'
>      option netmask '255.255.255.0'
>      option broadcast '102.164.176.255'
>
> ----
> /etc/config/firewall:
> config defaults
>      option syn_flood    1
>      option input        ACCEPT
>      option output        ACCEPT
>      option forward        REJECT
> # Uncomment this line to disable ipv6 rules
>      option disable_ipv6    1
>
> config zone
>      option name        wan
>      option network        'wan'
>      option input        ACCEPT
>      option output        ACCEPT
>      option forward        ACCEPT
>      option masq        1
>
> config zone
>      option name 'mesh'
>      option network 'mesh'
>      option input 'ACCEPT'
>      option output 'ACCEPT'
>      option forward 'ACCEPT'
>      option masq    1
>
> config zone
>      option name 'ap1'
>      option network 'ap1'
>      option input 'ACCEPT'
>      option output 'ACCEPT'
>      option forward 'DROP'
>      option masq    1
>
> config zone
>      option name 'ap2'
>      option network 'ap2'
>      option input 'ACCEPT'
>      option output 'ACCEPT'
>      option forward 'ACCEPT'
>      option masq    1
>
> config forwarding
>          option src 'mesh'
>          option dest 'mesh'
>
> config forwarding
>          option src 'ap1'
>          option dest 'mesh'
>
> config forwarding
>      option src 'ap1'
>      option dest 'ap1'
>
> config forwarding
>          option src 'ap2'
>          option dest 'mesh'
>
> config forwarding
>          option src 'ap2'
>          option dest 'ap2'
> ----
>
> Finally, below is the repeater node's txtinfo output.  Again, the
> 192.168.4.102 is the WAN IP of the gateway node.
>
> ----
> Table: Links
> Local IP    Remote IP    Hyst.    LQ    NLQ    Cost
> 5.211.164.176    5.118.19.82    0.00    1.000    0.886    1.128
>
> Table: Neighbors
> IP address    SYM    MPR    MPRS    Will.    2 Hop Neighbors
> 5.118.19.82    YES    NO    NO    3    0
>
> Table: Topology
> Dest. IP    Last hop IP    LQ    NLQ    Cost
> 5.211.164.176    5.118.19.82    0.886    1.000    1.128
> 5.118.19.82    5.211.164.176    1.000    0.886    1.128
>
> Table: HNA
> Destination    Gateway
> 0.0.0.0/0    5.118.19.82
>
> Table: MID
> IP address    Aliases
> 5.118.19.82    192.168.4.102
>
> Table: Routes
> Destination    Gateway IP    Metric    ETX    Interface
> 0.0.0.0/0    5.118.19.82    1    1.128    wlan0-2
> 5.118.19.82/32    5.118.19.82    1    1.128    wlan0-2
> 192.168.4.102/32    5.118.19.82    1    1.128    wlan0-2
> ----
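
When this recurs, it may help to compare the txtinfo tables of both nodes mechanically rather than by eye. Below is a rough Python sketch that parses output shaped like the dumps above and reports whether a node has a default route in its Routes table. It assumes columns are separated by tabs or by runs of two or more spaces, as in this mail; parse_txtinfo and has_default_route are names made up here, not part of olsrd.

```python
import re

def parse_txtinfo(text):
    """Parse txtinfo-style output into {table_name: [row_dict, ...]}."""
    tables = {}
    name, header = None, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            name, header = None, None   # a blank line ends the current table
            continue
        if line.startswith("Table:"):
            name = line.split(":", 1)[1].strip()
            tables[name] = []
            header = None
            continue
        if name is None:
            continue                    # ignore anything outside a table
        cells = re.split(r"\t+|\s{2,}", line)
        if header is None:
            header = cells              # first line after "Table:" is the header
        else:
            tables[name].append(dict(zip(header, cells)))
    return tables

def has_default_route(tables):
    """True if the Routes table contains a 0.0.0.0/0 entry."""
    return any(row.get("Destination") == "0.0.0.0/0"
               for row in tables.get("Routes", []))

SAMPLE = """Table: HNA
Destination    Gateway
0.0.0.0/0    5.118.19.82

Table: Routes
Destination    Gateway IP    Metric    ETX    Interface
0.0.0.0/0    5.118.19.82    1    1.128    wlan0-2
5.118.19.82/32    5.118.19.82    1    1.128    wlan0-2
"""

tables = parse_txtinfo(SAMPLE)
print("default route present:", has_default_route(tables))
```

Running this against dumps taken before and after a gateway reboot would show whether the repeater's view of HNA and Routes actually changes when the symptom appears.
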
>
>
>
>
>
> On Mon, Mar 24, 2014 at 1:47 AM, Ferry Huberts <(spam-protected)> wrote:
>
>     We have not seen this and we use smart gateway heavily.
>     Can you show configs of both nodes?
>
>
>     On 24/03/14 02:39, Ben West wrote:
>
>         I have seen sporadic instances of certain repeater nodes (not all,
>         generally a small subset of the repeater nodes in a given mesh)
>         losing their route through the gateway node if the gateway node
>         reboots while the repeater does not.
>
>         That is, the gateway node reboots, and the affected repeater node
>         thereafter appears to correctly re-establish its route thru the
>         gateway, but the gateway doesn't actually route the repeater's
>         traffic.  From the affected node, I can ping the gateway's mesh IP
>         and also the gateway's WAN IP, but I can't ping anything beyond the
>         gateway node's WAN interface.
>
>         Restarting olsrd on the repeater node seems to resolve this problem
>         consistently.
>
>         This is occurring on nodes running OpenWRT AA r39154 and OLSRd
>         v6.5-4, using SmartGateway.  I'm quoting my /etc/config/olsrd
>         below, used on all nodes alike.
>
>         Has anyone else observed a similar problem?  Browsing the
>         changelog at http://olsr.org/git/ since v6.5-4 doesn't show any
>         mention of explicit SmartGateway bugfixes, just additional features.
>
>         -----
>         config olsrd
>               # uncomment the following line to use a custom config file instead:
>               #option config_file '/etc/olsrd.conf'
>
>               option 'IpVersion' '4'
>               option 'LinkQualityLevel' '2'
>               option 'LinkQualityAlgorithm' 'etx_ffeth'
>               option 'SmartGateway' 'yes'
>               option 'Pollrate' '0.1'
>               option 'TcRedundancy'    '2'
>               option 'MprCoverage'    '5'
>
>         config 'LoadPlugin'
>               option 'library' 'olsrd_arprefresh.so.0.1'
>
>         config 'LoadPlugin'
>               option 'library' 'olsrd_dyn_gw.so.0.5'
>
>         config 'LoadPlugin'
>               option 'library' 'olsrd_dyn_gw_plain.so.0.4'
>
>         config 'LoadPlugin'
>             option 'library' 'olsrd_nameservice.so.0.3'
>             #option 'resolv_file' '/tmp/resolv.conf.auto'
>             option 'sighup_pid_file' '/var/run/dnsmasq.pid'
>             option 'suffix' '.mesh'
>
>         config 'LoadPlugin'
>               option 'library' 'olsrd_txtinfo.so.0.1'
>               option 'accept' '0.0.0.0'
>
>         config 'Interface'
>               list 'interface' 'mesh'
>               option 'Ip4Broadcast' '255.255.255.255'
>               option 'Mode' 'mesh'
>         #
>
>
>         --
>         Ben West
>         http://gowasabi.net
>         (spam-protected)
>         314-246-9434
>
>
>
>     --
>     Ferry Huberts
>
>
>
>
> --
> Ben West
> http://gowasabi.net
> (spam-protected)
> 314-246-9434

-- 
Ferry Huberts



