[Olsr-dev] Reliability problems using the OLSR2 mesh
Peter Emanuel
(spam-protected)
Fri Jul 29 01:55:32 CEST 2016
I am really having difficulty getting my mesh network to operate reliably
and would appreciate any help to diagnose what is going on. Here is my test
scenario:
5 Raspberry Pis running Linux, strung together line-of-sight, each using a
USB WiFi adapter on the ad hoc network
One Pi acts as the gateway to the Internet (10.100.18.4)
===========    ===========    ===========    ===========    ===========
10.100.18.4 -> 10.100.18.5 -> 10.100.18.6 -> 10.100.18.7 -> 10.100.18.8
===========    ===========    ===========    ===========    ===========
     |
     V
Eth0 (Internet gateway)
=============
192.168.1.100
=============
I have 2 Android devices (10.100.18.106 and 10.100.18.202) that join the
ad hoc network.
Everything seems to set up nicely. When I am in my home, all of the nodes are
within WiFi range of the gateway, and they register 10.100.18.4 as the
default gateway.
E.g. "ip route" for 10.100.18.5 is below (replace src 5 with 6, 7 and 8 for
the other nodes):
default via 10.100.18.4 dev wlan0 proto 100 src 10.100.18.5 metric 2 onlink
10.100.18.0/24 dev wlan0 proto kernel scope link src 10.100.18.5 metric 9
10.100.18.4 via 10.100.18.4 dev wlan0 proto 100 src 10.100.18.5 metric 2 onlink
10.100.18.6 via 10.100.18.4 dev wlan0 proto 100 src 10.100.18.5 metric 2 onlink
10.100.18.7 via 10.100.18.4 dev wlan0 proto 100 src 10.100.18.5 metric 2 onlink
10.100.18.8 via 10.100.18.8 dev wlan0 proto 100 src 10.100.18.5 metric 2 onlink
10.100.18.106 via 10.100.18.106 dev wlan0 proto 100 src 10.100.18.5 metric 2 onlink
10.100.18.202 via 10.100.18.202 dev wlan0 proto 100 src 10.100.18.5 metric 2 onlink
The Android nodes will set up in the same way
default via 10.100.18.4 dev wlan0 proto 100 src 10.100.18.106 metric 2 onlink
10.100.18.0/24 dev wlan0 proto kernel scope link src 10.100.18.106 metric 9
10.100.18.4 via 10.100.18.4 dev wlan0 proto 100 src 10.100.18.106 metric 2 onlink
10.100.18.5 via 10.100.18.4 dev wlan0 proto 100 src 10.100.18.106 metric 2 onlink
10.100.18.6 via 10.100.18.4 dev wlan0 proto 100 src 10.100.18.106 metric 2 onlink
10.100.18.7 via 10.100.18.4 dev wlan0 proto 100 src 10.100.18.106 metric 2 onlink
10.100.18.8 via 10.100.18.8 dev wlan0 proto 100 src 10.100.18.106 metric 2 onlink
10.100.18.202 via 10.100.18.202 dev wlan0 proto 100 src 10.100.18.106 metric 2 onlink
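To compare routing tables like these between nodes, the output of "ip route" could be reduced to destination and next-hop pairs. This is only a sketch: the function name next_hops is made up for illustration, and it assumes one route per line, as "ip route" prints it.

```shell
# next_hops: read `ip route` output on stdin and print "destination next-hop"
# for every route that goes via a gateway. Directly connected routes
# (those without "via") are skipped.
next_hops() {
  awk '$2 == "via" { print $1, $3 }'
}

# Example (hypothetical invocation on a mesh node):
#   ip route | next_hops
```

A destination whose next hop is its own address is reached directly as a neighbour; any other destination is relayed through the listed node.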
The gateway node 10.100.18.4 sets up with my home router as its default
gateway:
default via 192.168.1.1 dev eth0
This all looks great to me. However, Internet packets do not pass through
the 192.168.1.1 gateway, even in the local home setup, unless I set up
iptables rules to force the packets through:
# WlanIP must hold the gateway's own wlan0 address (10.100.18.4 in this setup)
# forward all traffic coming from wlan0 (that's not destined for the
# gateway itself) to eth0
iptables -A FORWARD ! --dst "${WlanIP}" -i wlan0 -o eth0 -j ACCEPT
# forward traffic coming from wlan0 to eth0
iptables -A FORWARD -i wlan0 -o eth0 -j ACCEPT
# forward traffic coming from eth0 to wlan0
iptables -A FORWARD -i eth0 -o wlan0 -j ACCEPT
# set up Network Address Translation (NAT) so that all forwarded traffic
# leaving via eth0 appears to be coming from the gateway's eth0 IP address
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
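Since forwarding between wlan0 and eth0 depends on these rules, it may also be worth confirming on the gateway that the kernel has IPv4 forwarding switched on at all. A minimal sketch, assuming a Linux gateway where the setting is exposed at /proc/sys/net/ipv4/ip_forward; the function name forwarding_state is made up for illustration.

```shell
# forwarding_state [FILE]: print "on" if FILE contains 1, otherwise "off".
# FILE defaults to the Linux IPv4 forwarding switch (assumed location).
forwarding_state() {
  file="${1:-/proc/sys/net/ipv4/ip_forward}"
  if [ "$(cat "$file" 2>/dev/null)" = "1" ]; then
    echo on
  else
    echo off
  fi
}

# Example (on the gateway):
#   forwarding_state
```

The FORWARD chain policy and per-rule packet counters can be inspected with "iptables -L FORWARD -n -v", and the NAT rule with "iptables -t nat -L POSTROUTING -n -v".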
I found this nice freifunk traceroute tool that displays the MPR route to
any node and beyond (https://downloads.open-mesh.org/fftrace/).
Without the iptables rules, fftrace just stops at the gateway. I can reach
the 192.168.1.1 gateway address from any of the nodes, including the Android
nodes, but traffic won't go beyond 192.168.1.1 without the iptables rules.
fftrace 192.168.1.1 inside home:
target-ip: 192.168.1.1 seq: 4 olsr-udp-packets
hop Quality etx lq nlq PingIp
__________________________________________________
10.100.18.6 1
10.100.18.4 2
192.168.1.1
fftrace google.com inside home will do the DNS lookup and then trace as
below
target-ip: 216.58.194.174 seq: 1 olsr-udp-packets
hop Quality etx lq nlq PingIp
__________________________________________________
10.100.18.6 1
10.100.18.4 2
192.168.1.1 3
10.97.0.1 4
96.34.122.10 5
96.34.120.108 6
96.34.2.2 7
96.34.0.0 8
96.34.3.1 9
72.14.220.11 10
216.239.49.168 11
64.233.175.249 12
216.58.194.174
Outside of the home from node 8, fftrace to google.com will look like
target-ip: 216.58.194.174 seq: 1 olsr-udp-packets
hop Quality etx lq nlq PingIp
__________________________________________________
10.100.18.7 1
10.100.18.6 2
10.100.18.5 3
10.100.18.4 4
192.168.1.1 5
10.97.0.1 6
96.34.122.10 7
96.34.120.108 8
96.34.2.2 9
96.34.0.0 10
96.34.3.1 11
72.14.220.11 12
216.239.49.168 13
64.233.175.249 14
216.58.194.174
Now comes the real issue for me. I haven't been able to figure out why,
when I string the nodes outdoors line of sight, things don't work as well
as they do over the 1 hop to the gateway inside the home. Some observations
on the outdoor nodes:
fftrace works fine on all nodes
ping works on all nodes with reasonable ping times
I can sit in my home and ssh into all the Linux nodes down the line and run
tests with reasonable performance.
All the nodes line up nicely. Node 5 sets up as shown above.
Node 6 sets up to route through node 5:
default via 10.100.18.5 dev wlan0 proto 100 src 10.100.18.6 metric 2 onlink
10.100.18.0/24 dev wlan0 proto kernel scope link src 10.100.18.6 metric 9
10.100.18.4 via 10.100.18.5 dev wlan0 proto 100 src 10.100.18.6 metric 2 onlink
10.100.18.5 via 10.100.18.5 dev wlan0 proto 100 src 10.100.18.6 metric 2 onlink
10.100.18.7 via 10.100.18.5 dev wlan0 proto 100 src 10.100.18.6 metric 2 onlink
10.100.18.8 via 10.100.18.5 dev wlan0 proto 100 src 10.100.18.6 metric 2 onlink
10.100.18.106 via 10.100.18.106 dev wlan0 proto 100 src 10.100.18.6 metric 2 onlink
10.100.18.202 via 10.100.18.202 dev wlan0 proto 100 src 10.100.18.6 metric 2 onlink
Likewise, node 7 routes through node 6, node 8 through node 7, etc.
The fftrace output on all of the nodes shows that packets travel down the
chain as expected, both for the Android devices and for the Linux Pis, when
the nodes are strung out in the neighborhood.
The reliability issue occurs as soon as the Android device (which is how I
test the Internet) moves more than 2 nodes down the chain. If the Android
device has a default gateway of 10.100.18.4, 10.100.18.5 or even 10.100.18.6,
depending on which Pi is closest, it all seems to work fine. As soon as I
move further down the line, with 10.100.18.7 or 10.100.18.8 as the Android
default gateway, Internet access becomes very flaky. The browser times out
with "No connection" and pages sometimes stop displaying. YouTube will
work when there are 2 hops but won't even connect when there are more.
Are there any log files that I should look at to determine why? In theory,
it looks to me like it should work. Unfortunately it isn't reliable. Sigh!
Any guidance would be greatly appreciated.
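One way to narrow down where the connection degrades would be to measure packet loss to each hop of the chain from the far end. The following is only a sketch: loss_percent and chain_loss are made-up helper names, the summary-line format is assumed to be that of the common Linux ping (iputils or busybox), and the count of 20 probes per hop is arbitrary.

```shell
# loss_percent: read `ping` output on stdin and print the packet-loss
# percentage taken from its summary line (e.g. "... 20% packet loss ...").
loss_percent() {
  sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# chain_loss HOP...: ping every hop in turn and print "hop loss%".
# An unreachable or unresolvable hop shows up as 100% or as an empty value.
chain_loss() {
  for hop in "$@"; do
    printf '%s %s%%\n' "$hop" "$(ping -c 20 -q "$hop" 2>/dev/null | loss_percent)"
  done
}

# Example, run on node 8 with the chain described in this mail:
#   chain_loss 10.100.18.7 10.100.18.6 10.100.18.5 10.100.18.4 192.168.1.1
```

If loss climbs sharply at one particular hop, that link would be the first place to look.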
Peter Emanuel