<div dir="ltr"><div>The specific failure mode I'm seeing is this: after the gateway node undergoes an unscheduled reboot, the only traffic the affected repeater node can get past the gateway node's WAN interface is ICMP ping. Other traffic, e.g. the DNS query from "nslookup <a href="http://somedomain.com" target="_blank">somedomain.com</a>", doesn't go through.<br>
<br></div><div>Again, this failure mode does not happen consistently, and I can't easily trigger it.<br></div><div><br></div>Since this fails for commands freshly entered at the shell prompt, e.g. nslookup, I am assuming it is not a problem with existing TCP sessions becoming stuck or orphaned.<br>
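<br>For reference, here is a rough sketch of how the symptom could be checked per protocol, using freshly opened sockets so that no pre-existing session is involved. It is only an illustration: the function names are made up for this example, the ping options assume a Linux or BusyBox ping (where -W is a timeout in seconds), and whichever host and resolver addresses get passed in are up to the tester.<br>
<pre>
```python
import socket
import struct
import subprocess


def build_dns_query(name, query_id=0x1234):
    """Build a minimal DNS A-record query for `name` (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.strip(".").split(".")
    )
    # Root label terminator, then QTYPE=A (1) and QCLASS=IN (1).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def probe_icmp(host, timeout_s=2):
    """True if a single ping is answered. Relies on the system ping binary."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def probe_udp_dns(resolver, name, timeout_s=2.0):
    """True if `resolver` answers a fresh UDP DNS query with a matching ID."""
    query = build_dns_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        try:
            sock.sendto(query, (resolver, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable-network errors
            return False
    return reply[:2] == query[:2]


def probe_tcp(host, port=80, timeout_s=2.0):
    """True if a brand-new TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def classify(icmp_ok, udp_ok, tcp_ok):
    """Summarise the three probe results as a short label."""
    if icmp_ok and udp_ok and tcp_ok:
        return "healthy"
    if icmp_ok and not (udp_ok or tcp_ok):
        return "icmp-only"  # the symptom described above
    if not (icmp_ok or udp_ok or tcp_ok):
        return "unreachable"
    return "partial"
```
</pre>
Run from an affected repeater against a host and a resolver beyond the gateway's WAN interface, a result of "icmp-only" would match what I'm describing, while "unreachable" would point at the route itself.<br>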
<div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Mar 26, 2014 at 1:30 PM, Ferry Huberts <span dir="ltr"><<a href="mailto:mailings@hupie.com" target="_blank">mailings@hupie.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><br>
<br>
On 26/03/14 17:32, Ben West wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
W/r/t situations where a gateway node undergoes an uncommanded<br>
reboot, e.g. power fails briefly, is there a preferred approach for<br>
ensuring all repeater nodes w/in that mesh have their NAT states<br>
refreshed as needed, when SmartGateway is in use?<br>
<br>
For example, if the repeater nodes detect their selected gateway node<br>
rebooting (i.e. becoming unavailable for a few minutes), or even a new<br>
gateway node coming online, should they restart their local instance of<br>
olsrd in response?<br>
<br>
Or, would the best approach perhaps be to upgrade to olsrd 0.6.6, using<br>
the same config? I do see these entries in Changelog that might be useful:<br>
<br>
kernel_route: olsr_os_inetgw_tunnel_route can now take the table<br>
gateway: let the gateway code determine the tunnel name<br>
gateway: remove the worst gateway before adding new one<br>
gateway: add SmartGatewayUseCount configuration parameter<br>
gateway: use SmartGatewayUseCount setting the the gateway lists<br>
gateway: add SmartGatewayEgressInterfaces configuration parameter<br>
gateway: add SmartGatewayMarkOffset{Egress,Tunnels} configuration<br>
parameters<br>
gateway: add SmartGatewayPolicyRoutingScript configuration parameter<br>
gateway: initialise a set of fixed tunnel names in/for multi-gateway mode<br>
gateway: initialise the egress interface names in/for multi-gateway mode<br>
gateway: use fixed tunnel names in/for multi-gateway mode<br>
gateway: setup and clear table specific default routes in/for<br>
multi-gateway mode<br>
gateway: setup/cleanup multi-gateway mode during startup/shutdown of olsrd<br>
gateway: introduce and use MULTI_GW_MODE define<br>
gateway: enable multi-gateway mode<br>
<br>
Besides that, I have now deployed this param to all nodes, and disabled<br>
dyn_gw_plain plugin. However, it looks like a few repeater nodes (not<br>
all, mysteriously) still see their route to the WAN, beyond the gateway<br>
node, break when the gateway node reboots.<br>
<br>
</blockquote>
<br></div></div>
What do you mean by 'break'?<br>
<br>
When the gateway reboots, traffic to the WAN obviously can't proceed.<br>
What happens between the breakage and the selection of a new gateway (at least 1 minute in your version) then depends on the application that initiated the connection.<br>
<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>
LoadPlugin "olsrd_dyn_gw.so.0.5"<br>
{<br>
PlParam "HNA" "0.0.0.0 0.0.0.0"<br>
}<br>
<br>
<br>
<br>
<br>
<br>
On Mon, Mar 24, 2014 at 2:55 PM, Teco Boot <<a href="mailto:teco@inf-net.nl" target="_blank">teco@inf-net.nl</a><br></div><div>
<mailto:<a href="mailto:teco@inf-net.nl" target="_blank">teco@inf-net.nl</a>>> wrote:<br>
<br>
Is *the traffic* from the same connection?<br>
If so, the NAT state is gone after a reboot and the connection must be<br>
restarted. Smart gateway cannot fix all problems.<br>
<br>
Teco<br>
<br>
<br>
On 24 Mar 2014, at 19:14, Ben West <<a href="mailto:ben@gowasabi.net" target="_blank">ben@gowasabi.net</a><br></div>
<mailto:<a href="mailto:ben@gowasabi.net" target="_blank">ben@gowasabi.net</a>>> wrote:<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>
Ferry pointed this out off-list. I've since removed dyn_gw_plain<br>
on the nodes where I was testing, and am trying to see if the<br>
problem can be repeated.<br>
<br>
<br>
On Mon, Mar 24, 2014 at 1:11 PM, Teco Boot <<a href="mailto:teco@inf-net.nl" target="_blank">teco@inf-net.nl</a><br></div><div>
<mailto:<a href="mailto:teco@inf-net.nl" target="_blank">teco@inf-net.nl</a>>> wrote:<br>
<br>
On the original posting: why use both dyn_gw and dyn_gw_plain?<br>
<br>
Teco<br>
<br>
<br>
On 24 Mar 2014, at 02:39, Ben West <<a href="mailto:ben@gowasabi.net" target="_blank">ben@gowasabi.net</a><br></div>
<mailto:<a href="mailto:ben@gowasabi.net" target="_blank">ben@gowasabi.net</a>>> wrote:<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>
I have seen sporadic instances of certain repeater nodes<br>
(not all, generally a small subset of all repeater nodes in a<br>
given mesh) losing their route through the gateway node if<br>
the gateway node reboots while the repeater does not.<br>
<br>
That is, the gateway node reboots, and the affected repeater<br>
node thereafter appears to correctly re-establish its route<br>
thru the gateway, but the gateway doesn't actually route the<br>
repeater's traffic. From the affected node, I can ping the<br>
gateway's mesh IP and also the gateway's WAN IP, but I can't<br>
ping anything beyond the gateway node's WAN interface.<br>
<br>
Restarting olsrd on the repeater node seems to resolve this<br>
problem consistently.<br>
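<br>
A rough sketch of how that workaround could be automated is below.<br>
It is only an illustration under several assumptions: the function<br>
names are made up, the ping options assume a Linux or BusyBox ping,<br>
the restart command assumes the usual OpenWRT init script at<br>
/etc/init.d/olsrd, and the interval and threshold are arbitrary.<br>
<pre>
```python
import subprocess
import time


def should_restart(gateway_ok, wan_ok, failures, threshold=3):
    """Decide whether to restart olsrd after one check.

    Counts consecutive checks where the gateway answers but nothing
    beyond it does. Returns (restart, failures) with the updated count.
    """
    if gateway_ok and not wan_ok:
        failures += 1
    else:
        failures = 0  # gateway down or WAN fine: not the stuck state
    if failures == threshold:
        return True, 0
    return False, failures


def ping(host, timeout_s=2):
    """True if a single ping is answered. Relies on the system ping binary."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(gateway_ip, wan_ip, interval_s=30, threshold=3):
    """Poll forever and restart olsrd when the stuck state persists."""
    failures = 0
    while True:
        restart, failures = should_restart(
            ping(gateway_ip), ping(wan_ip), failures, threshold
        )
        if restart:
            # Assumed init script path, see the note above.
            subprocess.run(["/etc/init.d/olsrd", "restart"], check=False)
        time.sleep(interval_s)
```
</pre>
Requiring several consecutive failures is meant to keep a gateway<br>
that is merely still booting from triggering a restart.<br>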
<br>
This is occurring on nodes running OpenWRT AA r39154 and<br>
OLSRd v6.5-4, using SmartGateway. I'm quoting my<br>
/etc/config/olsrd below, used on all nodes alike.<br>
<br>
Has anyone else observed a similar problem? Browsing the<br>
changelog at <a href="http://olsr.org/git/" target="_blank">http://olsr.org/git/</a> since v6.5-4 doesn't show<br>
any mention of explicit SmartGateway bugfixes, just<br>
additional features.<br>
<br>
-----<br>
config olsrd<br>
# uncomment the following line to use a custom config file instead:<br>
#option config_file '/etc/olsrd.conf'<br>
<br>
option 'IpVersion' '4'<br>
option 'LinkQualityLevel' '2'<br>
option 'LinkQualityAlgorithm' 'etx_ffeth'<br>
option 'SmartGateway' 'yes'<br>
option 'Pollrate' '0.1'<br>
option 'TcRedundancy' '2'<br>
option 'MprCoverage' '5'<br>
<br>
config 'LoadPlugin'<br>
option 'library' 'olsrd_arprefresh.so.0.1'<br>
<br>
config 'LoadPlugin'<br>
option 'library' 'olsrd_dyn_gw.so.0.5'<br>
<br>
config 'LoadPlugin'<br>
option 'library' 'olsrd_dyn_gw_plain.so.0.4'<br>
<br>
config 'LoadPlugin'<br>
option 'library' 'olsrd_nameservice.so.0.3'<br>
#option 'resolv_file' '/tmp/resolv.conf.auto'<br>
option 'sighup_pid_file' '/var/run/dnsmasq.pid'<br>
option 'suffix' '.mesh'<br>
<br>
config 'LoadPlugin'<br>
option 'library' 'olsrd_txtinfo.so.0.1'<br>
option 'accept' '0.0.0.0'<br>
<br>
config 'Interface'<br>
list 'interface' 'mesh'<br>
option 'Ip4Broadcast' '255.255.255.255'<br>
option 'Mode' 'mesh'<br>
#<br>
<br>
<br>
--<br>
Ben West<br></div></div>
<a href="http://gowasabi.net" target="_blank">http://gowasabi.net</a><br>
<a href="mailto:ben@gowasabi.net" target="_blank">ben@gowasabi.net</a><br>
<a href="tel:314-246-9434" value="+13142469434" target="_blank">314-246-9434</a><br>
--<br>
Olsr-users mailing list<br>
<a href="mailto:Olsr-users@lists.olsr.org" target="_blank">Olsr-users@lists.olsr.org</a><br>
<a href="https://lists.olsr.org/mailman/listinfo/olsr-users" target="_blank">https://lists.olsr.org/mailman/listinfo/olsr-users</a><br>
</blockquote>
<br>
<br>
<br>
<br>
</blockquote><div>
<br>
<br>
<br>
<br>
</div><div>
<br>
<br>
</div></blockquote><span><font color="#888888">
<br>
-- <br>
Ferry Huberts<br>
</font></span></blockquote></div><br><br clear="all"><br>-- <br>Ben West<div><a href="http://gowasabi.net" target="_blank">http://gowasabi.net</a><br><a href="mailto:ben@gowasabi.net" target="_blank">ben@gowasabi.net</a><br>
314-246-9434<br></div>
</div></div>