[Olsr-users] Route on repeater nodes sometimes break when gateway node reboots
Ben West
(spam-protected)
Mon Mar 24 16:56:17 CET 2014
Also, a further follow-up, along with a response directly back to the
list. I noticed a difference in the HNA gateway reported by each node's
txtinfo before and after restarting olsrd. I'm including both nodes'
txtinfo output below, first from the period when the repeater node couldn't
ping beyond the gateway's WAN, and then from after restarting olsrd on both
nodes, when routing worked as expected.
The specific problem appeared to be that both nodes had the gateway's mesh
IP address listed as the gateway in the HNA table, instead of its WAN IP.
This would seem to explain the repeater node's inability to access the WAN.
No node configuration was changed during this time. All that happened was
the gateway node underwent a power-cycle (uncommanded reboot), and later I
manually restarted olsrd on both nodes.
Are situations like these perhaps the purpose for the freifunk-gwcheck
package in OpenWRT? (Neither node has this installed.)
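A rough way to check for the HNA symptom described above is to parse the
txtinfo text and look at which gateway is announced for 0.0.0.0/0. The
sketch below is illustrative only: the function names (parse_txtinfo,
default_route_gateways) are made up for this example, the parsing assumes
the plain "Table: ..." layout shown in this message, and the two sample
strings are the repeater node's HNA and MID tables copied from the outputs
in this message.

```python
def parse_txtinfo(text):
    """Split txtinfo text into {table name: [list of whitespace-split rows]}.

    Assumes each table starts with a 'Table: <name>' line followed by one
    column-header line, as in the outputs quoted in this message.
    """
    tables = {}
    current = None
    skip_header = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("----"):
            continue
        if line.startswith("Table:"):
            current = line.split(":", 1)[1].strip()
            tables[current] = []
            skip_header = True
        elif current is not None:
            if skip_header:
                skip_header = False  # drop the column-header line
                continue
            tables[current].append(line.split())
    return tables


def default_route_gateways(tables):
    """Return the sorted, de-duplicated gateways announced for 0.0.0.0/0."""
    return sorted({
        row[1]
        for row in tables.get("HNA", [])
        if len(row) >= 2 and row[0] == "0.0.0.0/0"
    })


# Repeater node, while it could not reach the WAN (copied from this message).
BROKEN = """\
Table: HNA
Destination Gateway
0.0.0.0/0 5.118.19.82
Table: MID
IP address Aliases
5.118.19.82 192.168.4.102
"""

# Repeater node, after restarting olsrd (copied from this message).
FIXED = """\
Table: HNA
Destination Gateway
0.0.0.0/0 192.168.4.102
Table: MID
IP address Aliases
192.168.4.102 5.118.19.82
"""

if __name__ == "__main__":
    for label, text in (("broken", BROKEN), ("fixed", FIXED)):
        print(label, default_route_gateways(parse_txtinfo(text)))
```

Run against both states, this reports the mesh IP as the default-route
gateway in the broken case and the WAN IP in the working case, which is the
same difference visible by eye in the HNA tables.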
Gateway node's txtinfo output, while repeater node 5.211.164.176 can't
access WAN:
----
Table: Links
Local IP Remote IP Hyst. LQ NLQ Cost
5.118.19.82 5.211.164.176 0.00 0.874 1.000 1.143
Table: Neighbors
IP address SYM MPR MPRS Will. 2 Hop Neighbors
5.211.164.176 YES NO NO 3 0
Table: Topology
Dest. IP Last hop IP LQ NLQ Cost
5.211.164.176 5.118.19.82 0.874 1.000 1.143
5.118.19.82 5.211.164.176 1.000 0.874 1.143
Table: HNA
Destination Gateway
0.0.0.0/0 5.118.19.82
0.0.0.0/0 5.118.19.82
Table: MID
IP address Aliases
Table: Routes
Destination Gateway IP Metric ETX Interface
5.211.164.176/32 5.211.164.176 1 1.143 wlan0-2
----
Repeater node's txtinfo output, while it can't access WAN:
----
Table: Links
Local IP Remote IP Hyst. LQ NLQ Cost
5.211.164.176 5.118.19.82 0.00 1.000 0.886 1.128
Table: Neighbors
IP address SYM MPR MPRS Will. 2 Hop Neighbors
5.118.19.82 YES NO NO 3 0
Table: Topology
Dest. IP Last hop IP LQ NLQ Cost
5.211.164.176 5.118.19.82 0.886 1.000 1.128
5.118.19.82 5.211.164.176 1.000 0.886 1.128
Table: HNA
Destination Gateway
0.0.0.0/0 5.118.19.82
Table: MID
IP address Aliases
5.118.19.82 192.168.4.102
Table: Routes
Destination Gateway IP Metric ETX Interface
0.0.0.0/0 5.118.19.82 1 1.128 wlan0-2
5.118.19.82/32 5.118.19.82 1 1.128 wlan0-2
192.168.4.102/32 5.118.19.82 1 1.128 wlan0-2
----
Gateway node's txtinfo output, after restarting olsrd:
----
Table: Links
Local IP Remote IP Hyst. LQ NLQ Cost
5.118.19.82 5.211.164.176 0.00 1.000 1.000 1.000
Table: Neighbors
IP address SYM MPR MPRS Will. 2 Hop Neighbors
5.211.164.176 YES NO NO 3 0
Table: Topology
Dest. IP Last hop IP LQ NLQ Cost
192.168.4.102 5.211.164.176 1.000 1.000 1.000
5.211.164.176 192.168.4.102 1.000 1.000 1.000
Table: HNA
Destination Gateway
0.0.0.0/0 192.168.4.102
0.0.0.0/0 192.168.4.102
Table: MID
IP address Aliases
Table: Routes
Destination Gateway IP Metric ETX Interface
5.211.164.176/32 5.211.164.176 1 1.000 wlan0-2
----
Repeater node's txtinfo output, after restarting its olsrd:
----
Table: Links
Local IP Remote IP Hyst. LQ NLQ Cost
5.211.164.176 5.118.19.82 0.00 1.000 0.886 1.128
Table: Neighbors
IP address SYM MPR MPRS Will. 2 Hop Neighbors
192.168.4.102 YES NO NO 3 0
Table: Topology
Dest. IP Last hop IP LQ NLQ Cost
192.168.4.102 5.211.164.176 1.000 0.886 1.128
5.211.164.176 192.168.4.102 0.886 1.000 1.128
Table: HNA
Destination Gateway
0.0.0.0/0 192.168.4.102
Table: MID
IP address Aliases
192.168.4.102 5.118.19.82
Table: Routes
Destination Gateway IP Metric ETX Interface
0.0.0.0/0 5.118.19.82 1 1.128 wlan0-2
5.118.19.82/32 5.118.19.82 1 1.128 wlan0-2
192.168.4.102/32 5.118.19.82 1 1.128 wlan0-2
----
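One observation that may tie the tunnel names in the quoted message below
to the HNA difference above: read as eight hex digits, the suffix of each
tnl_ name matches one of the gateway's IPv4 addresses. tnl_05761352 decodes
to the gateway's mesh IP and tnl_c0a80466 to its WAN IP. That the name is
built from the gateway address in this way is inferred from these two names
only, not checked against the olsrd source, and the helper names below are
invented for this sketch.

```python
import ipaddress


def tunnel_name_to_ip(name, prefix="tnl_"):
    """Decode a 'tnl_XXXXXXXX' name into dotted-quad form.

    Assumption: the 8 hex digits are the IPv4 address in network byte order.
    Raises ValueError for names that do not fit that pattern (e.g. 'tunl0').
    """
    if not name.startswith(prefix):
        raise ValueError("not a %s tunnel name: %r" % (prefix, name))
    hex_part = name[len(prefix):]
    if len(hex_part) != 8:
        raise ValueError("expected 8 hex digits, got %r" % hex_part)
    return str(ipaddress.IPv4Address(int(hex_part, 16)))


def ip_to_tunnel_name(ip, prefix="tnl_"):
    """Inverse of tunnel_name_to_ip, under the same assumption."""
    return "%s%08x" % (prefix, int(ipaddress.IPv4Address(ip)))


if __name__ == "__main__":
    for name in ("tnl_05761352", "tnl_c0a80466"):
        print(name, "->", tunnel_name_to_ip(name))
```

If that reading is right, the tunnel that existed while routing was broken
pointed at the gateway's mesh IP, which would be consistent with the HNA
gateway listed at that time.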
On Mon, Mar 24, 2014 at 10:44 AM, Ben West <(spam-protected)> wrote:
> Hi Ferry,
>
> Thanks for responding. I am including at the bottom the
> network/firewall/olsrd configs for the nodes in question. Note that the
> nodes each broadcast 3 wireless virtual interfaces, an adhoc I/F and two
> APs (public and private).
>
> The symptom observed is that the repeater node described below, after
> rebooting the gateway node, becomes unable to ping anything beyond the
> gateway node's WAN IP. Restarting olsrd on the repeater node usually
> resolves the problem, although in this instance I'm writing about, I had to
> restart olsrd on both nodes. Also, rebooting the gateway node doesn't
> consistently trigger this problem on the repeater. I think the problem may
> be more likely to occur when the gateway node undergoes an uncommanded
> reboot.
>
> UPDATE: At the onset of this problem, I noticed that the repeater node had
> a tunnel from SmartGateway named "tnl_05761352" listed by ifconfig. The
> gateway node just listed the generic tunnel "tunl0". Restarting olsrd on
> the repeater and then the gateway node (in that order) did not cause
> "tnl_05761352" to be rebuilt; just a generic "tunl0" was listed on
> both nodes. At this point, the repeater node was getting correct routing
> beyond the WAN. Additionally, restarting olsrd on the repeater node a *2nd
> time* did lead to a new tunnel named tnl_c0a80466 appearing, and routing
> across all nodes continued to work normally.
>
> Also, upon restarting olsrd on the gateway node, these messages appeared
> in syslog:
> Mar 24 15:14:02 MyGateway daemon.info olsrd[1559]: Received netlink error
> code No such file or directory (-2)
> Mar 24 15:14:02 MyGateway daemon.err olsrd[1559]: Error on deleting policy
> rule aimed to activate RtTable 223!
>
> Do SmartGateway tunnel names somehow grow stale, such that a gateway node
> undergoing uncommanded reboot might leave repeater nodes with an orphaned
> half of their tunnel?
>
> Anyway, here is the olsrd config used by both nodes.
>
> -----
> config olsrd
> # uncomment the following line to use a custom config file instead:
> #option config_file '/etc/olsrd.conf'
>
> option 'IpVersion' '4'
> option 'LinkQualityLevel' '2'
> option 'LinkQualityAlgorithm' 'etx_ffeth'
> option 'SmartGateway' 'yes'
> option 'Pollrate' '0.1'
> option 'TcRedundancy' '2'
> option 'MprCoverage' '5'
>
> config 'LoadPlugin'
> option 'library' 'olsrd_arprefresh.so.0.1'
>
> config 'LoadPlugin'
> option 'library' 'olsrd_dyn_gw.so.0.5'
>
> config 'LoadPlugin'
> option 'library' 'olsrd_dyn_gw_plain.so.0.4'
>
> config 'LoadPlugin'
> option 'library' 'olsrd_nameservice.so.0.3'
> #option 'resolv_file' '/tmp/resolv.conf.auto'
> option 'sighup_pid_file' '/var/run/dnsmasq.pid'
> option 'suffix' '.mesh'
>
> config 'LoadPlugin'
> option 'library' 'olsrd_txtinfo.so.0.1'
> option 'accept' '0.0.0.0'
>
> config 'Interface'
> list 'interface' 'mesh'
> option 'Ip4Broadcast' '255.255.255.255'
> option 'Mode' 'mesh'
> #
> -----
>
> Next, here is the network / firewall config on the gateway node. The blurb
> in firewall.user came from a prior recommendation on this listserv about
> how to clamp MTU values for incoming, WAN-bound traffic in the SmartGateway
> tunnel. That iptables rule is only used on the gateway node.
>
> ----
> /etc/config/network:
> config interface loopback
> option ifname lo
> option proto static
> option ipaddr 127.0.0.1
> option netmask 255.0.0.0
>
> config interface wan
> option ifname eth0
> option proto dhcp
>
> config interface 'mesh'
> option proto 'static'
> option ipaddr '5.118.19.82'
> option dns '208.67.222.222 208.67.222.220'
> option netmask '255.0.0.0'
> option broadcast '255.255.255.255'
>
> config interface 'ap1'
> option proto 'static'
> option type 'bridge'
> option ipaddr '101.19.82.1'
> option netmask '255.255.255.0'
> option broadcast '101.19.82.255'
>
> config 'interface' 'ap2'
> option 'proto' 'static'
> option 'ipaddr' 102.19.82.1
> option 'netmask' 255.255.255.0
> option broadcast '102.19.82.255'
> ----
>
> /etc/config/firewall:
> config defaults
> option syn_flood 1
> option input ACCEPT
> option output ACCEPT
> option forward REJECT
> # Uncomment this line to disable ipv6 rules
> option disable_ipv6 1
>
> config zone
> option name wan
> option network 'wan'
> option input ACCEPT
> option output ACCEPT
> option forward ACCEPT
> option masq 1
>
> config zone
> option name 'mesh'
> option network 'mesh'
> option input 'ACCEPT'
> option output 'ACCEPT'
> option forward 'ACCEPT'
> option masq 1
>
> config zone
> option name 'ap1'
> option network 'ap1'
> option input 'ACCEPT'
> option output 'ACCEPT'
> option forward 'DROP'
>
> config zone
> option name 'ap2'
> option network 'ap2'
> option input 'ACCEPT'
> option output 'ACCEPT'
> option forward 'ACCEPT'
>
> config forwarding
> option src 'mesh'
> option dest 'wan'
>
> config forwarding
> option src 'mesh'
> option dest 'mesh'
>
> config forwarding
> option src 'ap1'
> option dest 'wan'
>
> config forwarding
> option src 'ap1'
> option dest 'ap1'
>
> config forwarding
> option src 'ap2'
> option dest 'wan'
>
> config forwarding
> option src 'ap2'
> option dest 'ap2'
>
> ----
> /etc/firewall.user:
> # Clamp all traffic leaving to MTU of OLSRd tunnel MTU
> iptables -t mangle -A POSTROUTING -o tnl_+ -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1440
> ----
>
> Below is the gateway node's txtinfo output. The 192.168.4.102 is the WAN
> IP of the gateway node.
>
> ----
> Table: Links
> Local IP Remote IP Hyst. LQ NLQ Cost
> 5.118.19.82 5.211.164.176 0.00 0.874 1.000 1.143
>
> Table: Neighbors
> IP address SYM MPR MPRS Will. 2 Hop Neighbors
> 5.211.164.176 YES NO NO 3 0
>
> Table: Topology
> Dest. IP Last hop IP LQ NLQ Cost
> 5.211.164.176 5.118.19.82 0.874 1.000 1.143
> 5.118.19.82 5.211.164.176 1.000 0.874 1.143
>
> Table: HNA
> Destination Gateway
> 0.0.0.0/0 5.118.19.82
> 0.0.0.0/0 5.118.19.82
>
> Table: MID
> IP address Aliases
>
> Table: Routes
> Destination Gateway IP Metric ETX Interface
> 5.211.164.176/32 5.211.164.176 1 1.143 wlan0-2
> ----
>
> Likewise, here is the network / firewall config on the repeater node. The
> stanzas for the 'wan' logical interface remain in place, although they're
> not actually used for any physical interface.
> ----
> /etc/config/network:
> config interface loopback
> option ifname lo
> option proto static
> option ipaddr 127.0.0.1
> option netmask 255.0.0.0
>
> config interface wan
> # option ifname eth0
> option proto dhcp
>
> config interface 'mesh'
> option proto 'static'
> option ipaddr '5.211.164.176'
> option dns '208.67.222.222 208.67.222.220'
> option netmask '255.0.0.0'
> option broadcast '255.255.255.255'
>
> config interface 'ap1'
> option proto 'static'
> option type 'bridge'
> option ipaddr '101.164.176.1'
> option netmask '255.255.255.0'
> option broadcast '101.164.176.255'
>
> config interface 'ap2'
> option proto 'static'
> option type 'bridge'
> option ifname eth0
> option ipaddr '102.164.176.1'
> option netmask '255.255.255.0'
> option broadcast '102.164.176.255'
>
> ----
> /etc/config/firewall:
> config defaults
> option syn_flood 1
> option input ACCEPT
> option output ACCEPT
> option forward REJECT
> # Uncomment this line to disable ipv6 rules
> option disable_ipv6 1
>
> config zone
> option name wan
> option network 'wan'
> option input ACCEPT
> option output ACCEPT
> option forward ACCEPT
> option masq 1
>
> config zone
> option name 'mesh'
> option network 'mesh'
> option input 'ACCEPT'
> option output 'ACCEPT'
> option forward 'ACCEPT'
> option masq 1
>
> config zone
> option name 'ap1'
> option network 'ap1'
> option input 'ACCEPT'
> option output 'ACCEPT'
> option forward 'DROP'
> option masq 1
>
> config zone
> option name 'ap2'
> option network 'ap2'
> option input 'ACCEPT'
> option output 'ACCEPT'
> option forward 'ACCEPT'
> option masq 1
>
> config forwarding
> option src 'mesh'
> option dest 'mesh'
>
> config forwarding
> option src 'ap1'
> option dest 'mesh'
>
> config forwarding
> option src 'ap1'
> option dest 'ap1'
>
> config forwarding
> option src 'ap2'
> option dest 'mesh'
>
> config forwarding
> option src 'ap2'
> option dest 'ap2'
> ----
>
> Finally, below is the repeater node's txtinfo output. Again, the
> 192.168.4.102 is the WAN IP of the gateway node.
>
> ----
> Table: Links
> Local IP Remote IP Hyst. LQ NLQ Cost
> 5.211.164.176 5.118.19.82 0.00 1.000 0.886 1.128
>
> Table: Neighbors
> IP address SYM MPR MPRS Will. 2 Hop Neighbors
> 5.118.19.82 YES NO NO 3 0
>
> Table: Topology
> Dest. IP Last hop IP LQ NLQ Cost
> 5.211.164.176 5.118.19.82 0.886 1.000 1.128
> 5.118.19.82 5.211.164.176 1.000 0.886 1.128
>
> Table: HNA
> Destination Gateway
> 0.0.0.0/0 5.118.19.82
>
> Table: MID
> IP address Aliases
> 5.118.19.82 192.168.4.102
>
> Table: Routes
> Destination Gateway IP Metric ETX Interface
> 0.0.0.0/0 5.118.19.82 1 1.128 wlan0-2
> 5.118.19.82/32 5.118.19.82 1 1.128 wlan0-2
> 192.168.4.102/32 5.118.19.82 1 1.128 wlan0-2
> ----
>
>
>
>
>
> On Mon, Mar 24, 2014 at 1:47 AM, Ferry Huberts <(spam-protected)> wrote:
>
>> We have not seen this and we use smart gateway heavily.
>> Can you show configs of both nodes?
>>
>>
>> On 24/03/14 02:39, Ben West wrote:
>>
>>> I have seen sporadic instances of certain repeater nodes (not all,
>>> generally a small subset of the repeater nodes in a given mesh) breaking
>>> their route through the gateway node if the gateway node reboots while
>>> the repeater does not.
>>>
>>> That is, the gateway node reboots, and the affected repeater node
>>> thereafter appears to correctly re-establish its route thru the gateway,
>>> but the gateway doesn't actually route the repeater's traffic. From the
>>> affected node, I can ping the gateway's mesh IP and also the gateway's
>>> WAN IP, but I can't ping anything beyond the gateway node's WAN
>>> interface.
>>>
>>> Restarting olsrd on the repeater node seems to resolve this problem
>>> consistently.
>>>
>>> This is occurring on nodes running OpenWRT AA r39154 and OLSRd v6.5-4,
>>> using SmartGateway. I'm quoting my /etc/config/olsrd below, used on all
>>> nodes alike.
>>>
>>> Has anyone else observed a similar problem? Browsing the changelog at
>>> http://olsr.org/git/ since v6.5-4 doesn't show any mention of explicit
>>> SmartGateway bugfixes, just additional features.
>>>
>>> -----
>>> config olsrd
>>> # uncomment the following line to use a custom config file instead:
>>> #option config_file '/etc/olsrd.conf'
>>>
>>> option 'IpVersion' '4'
>>> option 'LinkQualityLevel' '2'
>>> option 'LinkQualityAlgorithm' 'etx_ffeth'
>>> option 'SmartGateway' 'yes'
>>> option 'Pollrate' '0.1'
>>> option 'TcRedundancy' '2'
>>> option 'MprCoverage' '5'
>>>
>>> config 'LoadPlugin'
>>> option 'library' 'olsrd_arprefresh.so.0.1'
>>>
>>> config 'LoadPlugin'
>>> option 'library' 'olsrd_dyn_gw.so.0.5'
>>>
>>> config 'LoadPlugin'
>>> option 'library' 'olsrd_dyn_gw_plain.so.0.4'
>>>
>>> config 'LoadPlugin'
>>> option 'library' 'olsrd_nameservice.so.0.3'
>>> #option 'resolv_file' '/tmp/resolv.conf.auto'
>>> option 'sighup_pid_file' '/var/run/dnsmasq.pid'
>>> option 'suffix' '.mesh'
>>>
>>> config 'LoadPlugin'
>>> option 'library' 'olsrd_txtinfo.so.0.1'
>>> option 'accept' '0.0.0.0'
>>>
>>> config 'Interface'
>>> list 'interface' 'mesh'
>>> option 'Ip4Broadcast' '255.255.255.255'
>>> option 'Mode' 'mesh'
>>> #
>>>
>>>
>>> --
>>> Ben West
>>> http://gowasabi.net
>>> (spam-protected) <mailto:(spam-protected)>
>>> 314-246-9434
>>>
>>>
>>>
>> --
>> Ferry Huberts
>>
>
>
>
> --
> Ben West
> http://gowasabi.net
> (spam-protected)
> 314-246-9434
>
--
Ben West
http://gowasabi.net
(spam-protected)
314-246-9434