[Olsr-users] Managing Local Default Route offered as HNA 0.0.0.0 when healthy versus OLSR learned default route from mesh
Fri Sep 19 17:24:02 CEST 2008
I wanted to run an idea by everyone. It may have already been discussed
in the past, but I couldn't find anything, so here goes:
I've got a box with wired Ethernet as internet (DHCP or static IP w/ a
fixed default route not added by OLSR), 2.4 GHz for people to get on, and
5.8 GHz ad-hoc to the rest of the mesh -- so the typical 3 interfaces: one
wired, two wireless (AP and backhaul).
With the OLSR dyn_gw plugin, if the wired internet fails the pings, OLSR
will install a default route from the mesh, yielding 2 default routes.
Problem is the wired default route is favored over the OLSR-installed
default route (zero metric versus the non-zero OLSR default route metric),
so people on the 2.4 GHz side of this same box don't get internet access
because they keep trying to take the wired internet rather than the
working default route from the mesh. Other nodes on the mesh that were
using this box when it lost its internet are OK because they'll seek
some other node w/ an internet HNA -- i.e. nodes in the mesh w/ no
internet backhaul (empty eth0 connection) just pick whoever has internet
to offer from the nodes running the dyn_gw plugin.
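To make the metric problem concrete, here is a small sketch that picks the default route the kernel would prefer out of "ip route show default" style output, treating a missing metric as 0. The function name best_default, the addresses, and the interface names are made up for illustration; this is not part of OLSR or dyn_gw.

```shell
#!/bin/sh
# Sketch only: reads "ip route show default" style lines on stdin and prints
# the one with the lowest metric. A route with no "metric" keyword counts as 0.
best_default() {
    awk '$1 == "default" {
        m = 0
        for (i = 2; i < NF; i++) if ($i == "metric") m = $(i + 1)
        if (!seen || m + 0 < best) { best = m + 0; line = $0; seen = 1 }
    } END { if (seen) print line }'
}

# Typical use on a live box (read-only, no root needed):
#   ip route show default | best_default
```

Fed the two lines "default via 192.0.2.1 dev eth0" and "default via 10.0.0.5 dev ath1 metric 2", it prints the eth0 line, which is the situation described above: the dead wired route wins because its metric is lower.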
My plan to fix this - curious if you guys think it will work:
- Switch to dyn_gw_plain plugin
- Use "ip route" and "ip rule" commands to add a 2nd routing table with a
default route out the eth0 internet connection. ip rule will add a rule
so traffic that uses the eth0 internet connection's source IP always takes
the wired internet connection via this extra routing table. Other traffic
is subject to the normal "main" routing table.
- The "main" routing table will initially have no default route -- thus
dyn_gw_plain will not inject in the mesh any default route to "offer"
other nodes, but rather OLSR will learn a default route from the mesh if
one is available -- that's fine.
- Set up a script that pings known internet IPs like the normal dyn_gw
plugin does, using the source IP of the wired eth0 internet connection
so the pings always use the extra routing table, since ip rule directs
that traffic there. When the pings show the internet is good, it can do
"route add default gw x.x.x.x" so the main routing table gets a non-OLSR
default route when the local internet connection is determined "good".
- dyn_gw_plain will "notice" the non-OLSR default route, announce it as an
HNA 0.0.0.0 to the mesh, and the box will use that default route. If an
OLSR-installed default route to another node in the mesh was in use, it
should go away in favor of the local internet route -- I've seen this
behavior with dyn_gw, but haven't experimented with dyn_gw_plain yet;
I'm expecting that part to be the same.
- The separate ping script continues in the background (it's the only
thing using the extra routing table). If it determines the gateway to
be "dead" by failing to get responses, it can simply delete the non-OLSR
default route it added, and thus dyn_gw_plain will "notice" once again
that the gateway is dead and stop the HNA 0.0.0.0. OLSR will eventually
learn a default route from a peer node in the mesh to use until the local
wired connection is determined good once again by the ping script.
- The above cycle goes on forever.
- This would allow people on the 2.4 GHz ath0 network to take the local
internet connection when it's up, and switch in a reasonable amount of
time if the local internet dies and OLSR learns a default route from the
mesh. It's perfectly acceptable for an in-progress download to die
during the switch -- there's only light surfing going on with this setup.
If the box is acting as local DNS server (w/ BIND in my case), name
lookups will take the local eth0 internet when it's good, or take the
mesh default route when the local internet is bad, so name resolution
follows whichever default route is active. The ping script doesn't need
to resolve names -- like the dyn_gw plugin, I plan to use just IP
addresses of known internet sites that should be responsive when the
local internet connection is healthy.
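A rough sketch of how the steps above could fit together in one script. This is only an illustration under stated assumptions, not a tested implementation: the addresses (documentation ranges), table number 100, ping targets, 10-second interval, and all function names are placeholders, and it uses "ip route add default via" as the iproute2 equivalent of "route add default gw". It also assumes the main table has no default route of its own at startup.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder, not taken from a real setup.
WAN_IF=${WAN_IF:-eth0}          # wired internet interface
WAN_IP=${WAN_IP:-192.0.2.10}    # address on eth0
WAN_GW=${WAN_GW:-192.0.2.1}     # upstream gateway reached via eth0
TABLE=${TABLE:-100}             # number of the extra routing table
TARGETS=${TARGETS:-"198.51.100.1 203.0.113.1"}  # replace with real internet IPs
DRY_RUN=${DRY_RUN:-1}           # 1 = only print commands, 0 = run them (needs root)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# Extra table: traffic sourced from the eth0 address always leaves via eth0.
setup_policy_routing() {
    run ip route add default via "$WAN_GW" dev "$WAN_IF" table "$TABLE"
    run ip rule add from "$WAN_IP" lookup "$TABLE"
}

# Succeeds if any target answers one ping sent from the eth0 address.
wan_alive() {
    for t in $TARGETS; do
        ping -c 1 -W 2 -I "$WAN_IP" "$t" >/dev/null 2>&1 && return 0
    done
    return 1
}

# Add or remove the non-OLSR default route in "main" only on a state change.
# $1 = previous state, $2 = new state; both are "up" or "down".
apply_state() {
    [ "$1" = "$2" ] && return 0
    if [ "$2" = up ]; then
        run ip route add default via "$WAN_GW" dev "$WAN_IF"
    else
        run ip route del default via "$WAN_GW" dev "$WAN_IF"
    fi
}

# Background loop: starts in state "down" so main has no local default route.
watch_wan() {
    state=down
    setup_policy_routing
    while :; do
        if wan_alive; then new=up; else new=down; fi
        apply_state "$state" "$new"
        state=$new
        sleep 10
    done
}

# Start the loop only when asked to, e.g. "DRY_RUN=0 sh wan-watch.sh run".
if [ "${1:-}" = run ]; then watch_wan; fi
```

With DRY_RUN left at 1 the functions only print the ip commands, so the sequence can be checked without root. A real script would probably want several missed pings in a row before pulling the route, so one lost packet doesn't flap the HNA.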
I've used this type of strategy for multiple ISP connections, i.e. my
work has a Soekris net5501 that is our firewall / gateway etc. and we've
got a business cable internet connection that bursts to 24 megabit, and
then a T1 at 1 megabit in a phone/data setup. I've got multiple routing
tables and pings, so if the cable goes down, we switch to the T1 default
route in the main routing table. If the cable comes back up, we "favor"
the cable interface since it's faster. When traffic goes out either
connection, the traffic is NATed etc. I've found it's best to send all
data out the cable if available and use the T1 only if the cable is down,
rather than equal-cost multipath on both, since when the T1 is picked
that particular flow gets slow speed, while people who happen to get the
cable at the moment they go out see very fast speed. It's 100% one or the
other based on link status instead of multipath. It works great. I'm
trying to do something similar w/ OLSR and local default gateway
monitoring to automate things as internet backhauls go up and down. It
also has the benefit that if we want to force a node to stop using its
local backhaul, we can simply unplug the link and wait for OLSR to rip
out the HNA 0.0.0.0 and pick up a default route from the mesh.
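The cable-first, T1-only-as-fallback policy described above boils down to a very small decision. A minimal sketch, assuming the link states have already been worked out by the pings; the function names, gateway addresses, and interface names are all placeholders rather than my actual config:

```shell
#!/bin/sh
# Sketch only: gateway addresses and interface names are placeholders.
CABLE_GW=${CABLE_GW:-192.0.2.1}
CABLE_IF=${CABLE_IF:-eth1}
T1_GW=${T1_GW:-198.51.100.1}
T1_IF=${T1_IF:-eth2}

# pick_uplink CABLE_STATE T1_STATE
# Prints "cable" whenever the cable is up, "t1" only if the cable is down,
# and "none" if both are down. Never both at once (no multipath).
pick_uplink() {
    if [ "$1" = up ]; then
        echo cable
    elif [ "$2" = up ]; then
        echo t1
    else
        echo none
    fi
}

# Print the command that would point the main default route at that uplink.
switch_cmd() {
    case "$1" in
        cable) echo "ip route replace default via $CABLE_GW dev $CABLE_IF" ;;
        t1)    echo "ip route replace default via $T1_GW dev $T1_IF" ;;
        none)  echo "ip route del default" ;;
    esac
}
```

For example, switch_cmd "$(pick_uplink down up)" prints the "ip route replace" line for the T1 gateway, and as soon as the cable state is "up" again the same call returns the cable line.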
Any thoughts? I'd be interested in hearing about anyone else's
solutions for this type of goal.
If people really like what I've come up with, I'll be happy to post my
scripts and such for others to enjoy. I'm planning on testing this
weekend unless someone tells me of a fundamental problem with my
strategy described above.