[olsr-dev] Improvements in the algorithm

Wed Nov 16 07:19:51 CET 2005

It's good to see this sort of discussion on this list.  I've been doing some
thinking myself about this.

Basically the problem with OLSR is that it is a flat routing protocol.  That
is, it is not hierarchical.  In a traditional wired network, hierarchy can
be seen in the CIDR subnetting of IP address space.  
Routing in a wired network is achieved like so:
Packet destination address is 10.10.100.1 - the local router has a route for
10.10.0.0/16 and so sends it towards that /16 network.  The next router has
a route for 10.10.96.0/20 and so sends the packet towards that particular
/20 network.  The next router has a route for 10.10.100.0/28 and so sends
the packet to that network - hopefully that network is an Ethernet LAN and
it can then find it's way to 10.10.100.1 by doing an ARP (IP-to-MAC address)
lookup.  With the MAC address, the router on the destination network can
simply send the packet and the switch on the destination network will send
the packet to it's destination host.

There is a problem though.  If the node with the IP address of 10.10.100.1
moves to another area, it may no longer be within cabled range of the
10.10.100.0/28 network.  So it must change it's IP address.  This is also
true of wireless networks that are subnetted the same way - with each
geographical location covering a different range of IP addresses.  If a node
moves across a geographic subnet border it must change it's IP address.  Any
current data connections must be closed and re-established after the new
address is configured.  Also, because it has a new IP address, other nodes
may now not know where to find it.  Traditional methods of keeping track of
dynamic IP addresses, such as dynamic DNS or mobile IP are basically too
slow to keep track of a fast-roaming node, and they require centralised,
human-managed servers to maintain the identity-to-location lookup tables.

Mesh networks arose to facilitate ad-hoc networking.  They require no
centralised servers or routers, and no planning and configuration of a
network structure.  Protocols like OLSR still require human-managed static
IP addressing, however some groups are attempting to address this problem
with dynamic addressing schemes.  To facilitate mobile roaming within the
mesh, mesh protocols generally group all nodes under the one large IP
netmask.  They must do this because grouping the network into smaller IP
subnets would require the nodes to change IP address as they roamed from one
area to another.  So by not using subnetting, mesh networks are "flat" from
an IP point of view.  OLSR does not use any other method to impose
hierarchy, so networks that use standard OLSR are truly "flat".

The problem arises when the mesh becomes large.  In OLSR, every node always
knows a specific route to every other node.  There is no aggregation of this
routing information, so a network of 150 nodes equals a routing table of 150
entries held by every node.  Also, as onelektra points out, OLSR uses a lot
of control packets to keep every node updated of every other node's status -
Hellos, TCs, HNAs, etc.  When something changes, all these things must be
updated in every node across the mesh.

Some other mesh routing protocols attempt to address this problem.  CGSR,
HSR, ZRP and LANMAR are listed here: 
http://tenet.knu.ac.kr/research/seminar/khpark_030809.ppt
Basically they all localise routing information via one method or another.
Nodes have specific routes to nearby nodes, but have generalised routes to
nodes further away.  They might somehow know that their destination node is
part of a cluster and send the packet towards that cluster - trusting that
once the packet gets there the nodes in that cluster will route the packet
towards the intended node.  Or they might request that a route for a distant
node be created as needed - this is "on demand" routing.

Some of these protocols might be better than OLSR, but I don't know because
I haven't tried them.  I don't know of any widespread community-based
implementations of these protocols and they don't seem to be available for
the WRT54G/OpenWRT.  Because OLSR is so widespread, it is probably better to
try to build hierarchical structure into it than try to switch everybody to
another protocol.  I'll leave it up to smarter heads than mine to work out
the best way to do this, but I will state my preference - I don't like
protocols that elect "Critical Nodes".  Critical Nodes (AKA Clusterheads)
act as traffic coordinators for their local cluster and they receive an
unfair burden of routing overhead.  For a protocol to be truly
fault-tolerant I believe all nodes should be equal - the protocol should be
smart enough not to require a large routing overhead in the first place.

So for OLSR to become scalable, it needs to implement some sort of
hierarchal structure.  And for it to remain viable as a Mobile Ad-hoc
protocol, this structure needs to be self-configuring.  Requiring node
administrators to determine where cluster borders lie and to enter this
information into a configuration file is unfeasible - the protocol needs
algorithms that can work out the optimum hierarchy automatically - easier
said than done I'm sure.

Recently I came across another protocol that I do think could be the future
of mesh routing, and possibly of all Internet routing.  It is called Dynamic
Address Routing (DART).  It has been designed from the ground up to be
globally scalable, yet its essence is startlingly simple.  It recognises the
problem of IP addressing - the fact that it scales well but doesn't handle
mobility.  This is because an IP address is designed to be two things - it
is a node's identity on a network, and it is a marker of a node's location
within the wider network.  

DART separates identity from location.  Each node has two addresses - a
static node identifier (which could be an IP address), and a dynamic routing
address, which changes to reflect the node's location within the overall
mesh topology.  The routing address is a binary number and each "subnet"
contains only two nodes - the 0 and 1 of the last bit in the address.  So
the level of hierarchical organisation is very high - but the number of
entries in each node's routing table is very low.  As each packet travels
from it's source to it's destination, it follows a binary tree - the routing
path is built into the addressing system, so routing entries are needed only
to specify the exceptions to the binary-tree rule.  And because the
addresses are dynamic, any nodes with the "incorrect" address simply have
their address automatically changed to reflect their "correct" location in
the mesh.

DART doesn't yet have any code available, but the authors are promising to
release a Linux implementation soon.  Their poster (see below) shows that
they intend to use Linksys routers and IPAQ PDAs as their testbeds, so they
are obviously serious about making DART work on small-footprint,
limited-resource platforms.

For more information about DART, see these pages:

DART Project Homepage:
http://www.cs.ucr.edu/~jeriksson/dartweb/

DART Poster - a nice easy summary of the project
http://www.cs.ucr.edu/~jeriksson/dartweb/DART_Poster_ICNP_03.pdf

Scalable Ad Hoc Routing: The Case for Dynamic Addressing - the in-depth
paper outlining the specifics of the protocol - presented to INFOCOMM 2004
http://www.cs.ucr.edu/~jeriksson/dartweb/DART_INFOCOM_04.pdf

Cheers,

Dan