[Olsr-users] Managing Local Default Route offered as HNA 0.0.0.0 when healthy versus OLSR learned default route from mesh

Eric Malkowski (spam-protected)
Fri Sep 19 17:24:02 CEST 2008


Hi all-

I wanted to run an idea by everyone -- it may have already been discussed 
in the past, but I couldn't find anything, so here goes:

I've got a box with wired Ethernet as its internet uplink (DHCP or static 
IP w/ a fixed default route not added by OLSR), 2.4 GHz for people to get 
on, and 5.8 GHz ad-hoc to the rest of the mesh - so the typical 3 
interfaces: one wired, and two wireless (AP and backhaul).

With the OLSR dyn_gw plugin, if the wired internet fails the pings, OLSR 
will install a default route from the mesh, yielding 2 default routes.
The problem is that the wired default route is favored over the 
OLSR-installed default route (zero metric versus non-zero OLSR default 
route metric), so people on the 2.4 GHz side of this same box don't get 
internet access: their traffic keeps trying to take the dead wired 
internet rather than the better default route from the mesh.  Other nodes 
on the mesh that were using this box when it lost its internet are OK, 
because they'll seek some other node w/ an internet HNA -- i.e. nodes in 
the mesh w/ no internet backhaul (empty eth0 connection) just pick 
whoever has internet to offer from the nodes running the dyn_gw plugin.
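
To illustrate the metric problem described above, here is a small sketch 
that reads "ip route show default"-style lines and prints the one with the 
lowest metric, which is the one the kernel prefers.  The function name and 
the addresses in the example comment are made up for illustration; a route 
printed without a "metric" field is treated as metric 0.

```shell
#!/bin/sh
# Hypothetical helper: print the default route with the lowest metric.
# Input: lines in the style of "ip route show default" on stdin.
preferred_default() {
    awk '$1 == "default" {
             metric = 0                      # no "metric" field means 0
             for (i = 2; i < NF; i++)
                 if ($i == "metric") metric = $(i + 1) + 0
             if (best == "" || metric < bestmetric) {
                 best = $0
                 bestmetric = metric
             }
         }
         END { if (best != "") print best }'
}

# Example (addresses are placeholders):
#   printf '%s\n' 'default via 192.0.2.1 dev eth0' \
#                 'default via 10.0.0.5 dev ath1 metric 2' | preferred_default
```

With those two example lines the wired eth0 route wins even when the wired 
uplink is dead, which is the situation described above.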

My plan to fix this - curious if you guys think it will work:

- Switch to dyn_gw_plain plugin
- Use "ip route" and "ip rule" commands to add a 2nd routing table with 
a default route out the eth0 internet connection.  ip rule will add a 
rule so that traffic using the eth0 internet connection's source IP 
always takes the wired internet connection via this extra routing table.  
Other traffic is subject to the normal "main" routing table.
- The "main" routing table will initially have no default route -- thus 
dyn_gw_plain will not inject in the mesh any default route to "offer" 
other nodes, but rather OLSR will learn a default route from the mesh if 
one is available -- that's fine.
- Setup a script that uses ping of known internet IPs like normal dyn_gw 
plugin and it uses the source IP of the wired internet eth0 connection 
so the pings always use the extra routing table since ip rule directs 
the traffic there.  When the pings see the internet is good, it can do 
"route add default gw x.x.x.x" so the main routing table gets a non-olsr 
default route when local internet connection is determined "good".
- dyn_gw_plain will "notice" the non-olsr default route, add it as an 
HNA 0.0.0.0 for the mesh, and use the default route - if an OLSR-installed 
default route to another node in the mesh was in use, it should go 
away in favor of the local internet route -- I've seen this behavior 
with dyn_gw, but haven't experimented with dyn_gw_plain yet -- expecting 
that part is the same.
- The separate ping script continues in the background (it's the only 
thing using the extra routing table).  If it determines the gateway to 
be "dead" by failing to get responses, it can simply delete the non-olsr 
default route it added, and thus dyn_gw_plain will "notice" once again 
that the gateway is dead and stop the HNA 0.0.0.0.  OLSR will eventually 
learn a default route via a peer node in the mesh to use until the local 
wired connection is determined good once again by the ping script.
- The above cycle goes on forever.
- This would allow people on the 2.4 GHz ath0 network to take the local 
internet connection when it's up, and switch in a reasonable amount of 
time if the local internet dies and OLSR learns a default route from the 
mesh.  It's perfectly acceptable for an in-progress download to die 
during the switch -- it's only light surfing going on with this setup.  
If the box is acting as local DNS server (w/ BIND in my case), name 
lookups will take the local eth0 internet when it's good, or take the 
mesh default route when local internet is bad -- so name resolution 
follows whichever default route is active.  The ping script doesn't need 
to resolve names - like the dyn_gw plugin, I plan to use just IP 
addresses of known internet sites that should be responsive when the 
local internet connection is healthy.
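
The bullets above could be sketched roughly as the shell script below.  
This is an untested outline, not a finished implementation: the addresses 
(taken from documentation ranges), the table number 100, the variable 
names and the function names are all placeholders.  It uses "ip route 
replace" where the plan says "route add default gw" so that repeating a 
step is harmless.  By default it only prints the commands (DRY_RUN=1).

```shell
#!/bin/sh
# Sketch only; every value below is a placeholder.
WAN_IF=eth0
WAN_IP=192.0.2.10                       # source IP of the wired interface
WAN_GW=192.0.2.1                        # wired gateway
TABLE=100                               # number of the extra routing table
TARGETS="198.51.100.1 203.0.113.1"      # replace with reliable internet IPs
DRY_RUN=${DRY_RUN:-1}                   # 1 = print commands, 0 = run them

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# Extra table with its own default route, selected by source address.
# Note: running this twice may add the rule twice.
setup_policy_routing() {
    run ip route replace default via "$WAN_GW" dev "$WAN_IF" table "$TABLE"
    run ip rule add from "$WAN_IP" lookup "$TABLE"
}

# Succeeds if any target answers one ping sent from the wired source IP,
# so the probe follows the extra table and not the main one.
uplink_alive() {
    for t in $TARGETS; do
        ping -c 1 -W 2 -I "$WAN_IP" "$t" >/dev/null 2>&1 && return 0
    done
    return 1
}

# Add or remove the non-OLSR default route in the main table.
set_main_default() {
    case $1 in
        up)   run ip route replace default via "$WAN_GW" dev "$WAN_IF" ;;
        down) run ip route del default via "$WAN_GW" dev "$WAN_IF" ;;
    esac
}

# Poll forever; touch the main table only when the state changes.
# Starts as "down" because the main table begins with no default route.
monitor() {
    state=down
    while :; do
        if uplink_alive; then new=up; else new=down; fi
        if [ "$new" != "$state" ]; then
            set_main_default "$new"
            state=$new
        fi
        sleep "${INTERVAL:-10}"
    done
}
```

To try it, call setup_policy_routing once and then monitor.  As written, a 
single failed round of pings flips the state; requiring a few consecutive 
failures before declaring the gateway dead might reduce flapping.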

I've used this type of strategy for multiple ISP connections i.e. my 
work has a Soekris net5501 that is our firewall / gateway etc. and we've 
got a business Cable internet connection that does 24 megabit burst, and 
then a T1 at 1 megabit phone/data setup.  I've got multiple routing 
tables and pings so if the cable goes down, we switch to T1 default 
route in the main routing table.  If the Cable comes back up, we "favor" 
the cable interface since it's faster.  When traffic goes out either 
connection, the traffic is NATed etc.  I've found it's best to send all 
data out the Cable if available and use the T1 only if the cable is down, 
rather than equal cost multipath on both: with multipath, any flow that 
happens to get the T1 is slow, while a flow that happens to get the cable 
is very fast.  It's 100% one or the other based on link status instead of 
multipath.  It works great.  I'm trying to do something similar w/ OLSR 
and local default gateway monitoring to automate things as internet 
backhauls go up and down.  It also has the benefit that if we want to 
force a node to stop using its local backhaul, we can simply unplug the 
link and wait for OLSR to rip out the HNA 0.0.0.0 and pick up a default 
route from the mesh.
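
The "100% one or the other" choice described above comes down to picking 
the first healthy link from a preference-ordered list.  A minimal sketch 
of that selection follows; the function name and the "name:status" 
argument format are invented for illustration, and the health status would 
come from whatever ping checks are in use.

```shell
#!/bin/sh
# Hypothetical helper: print the first uplink whose status is "up".
# Arguments are "name:status" pairs, most preferred first.
# Returns non-zero if no uplink is up.
pick_uplink() {
    for entry in "$@"; do
        name=${entry%%:*}       # text before the first colon
        status=${entry#*:}      # text after the first colon
        if [ "$status" = up ]; then
            echo "$name"
            return 0
        fi
    done
    return 1
}

# Example: pick_uplink cable:down t1:up
```

Listing the cable first means it is chosen whenever it is healthy, and the 
T1 is used only while the cable is down.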

Any thoughts?  I'd be interested in hearing about anyone else's 
solutions for this type of goal.

If people really like what I've come up with, I'll be happy to post my 
scripts and such for others to enjoy.  I'm planning on testing this 
weekend unless someone tells me of a fundamental problem with my 
strategy described above.

Thanks.

-Eric




