[olsr-dev] Changes and some bug fixes to olsrd

Erik Tromp (spam-protected)
Mon Nov 27 13:48:25 CET 2006


Dear Maintainers of the OLSRD code,

I would like to propose a few fixes and enhancements to the current OLSR code. The proposed changes are the diff file which can be
downloaded from http://home.tiscali.nl/levab001/olsrd-0.4.10.diff. The diff's are with respect to the current version 0.4.10, as can
be downloaded at: http://www.olsr.org/releases/0.4/olsrd-0.4.10.tar.gz

Just to be sure you're doing the same as I did, here is the patch procedure. Unpack the downloaded .tar.gz file, e.g.

  tar -xvzf olsrd-0.4.10.tar.gz

A directory olsrd-0.4.10 will be created. cd to that directory. Copy the olsrd-0.4.10.diff file as attached to this mail into that
directory. Then type:

  patch -p 1 < olsrd-0.4.10.diff

You should see a lot of lines starting with "patching file ...". That will conclude the patch process.

Below is a list of the changes I propose; some rather trivial, but some very 'challenging' ;-) All changes have been tested, but of
course, there may always be stuff that I missed.

Have fun!

Erik


* Added comment about 'Weight' and 'LinkQualityMult' parameters in
  sample .conf files.


* httpinfo plugin:
  - make IP-addresses clickable
  - Show ETX value of a HNA route
  - Show number of MID aliases for each MID entry
  - Print LQ information only if LQ enabled


* Fix double definition MAX_IFS (16) and MAXIFS (8)


* Increased MAX_IFS to 32


* Fix copying of old instead of new main IP address in chk_if_changed(...) function


* Fix link-set -> neighbor-set mismatch that occurs when a neighbor node changes its main IP address.


* Fixed an inconsistency in the LQ MPR calculation, where a node would be selected
  as MPR to a 2-hop neighbor even if the 2-hop neigbor could be reached in
  1 hop at a lower ETX.

  Explanation:
  Suppose we have 4 nodes, A, B, C and D, in the following topology:

        1   1
      A ----- B
   0.5| \0.5  | 1
      |  \    |
      |   \   |
      |    \  |
   0.5|  0.5\ | 1
      C ----- D
       0.5  0.5

  The number on each side of the link is the link quality value as observed
  by the corresponding node. mpr_coverage is 2.

  Currently, the function olsr_calculate_lq_mpr in node A does the following
  (follow along in the source code):

  // loop through all 2-hop neighbours
  --> the only 2-hop neighbor is D

  // check whether this 2-hop neighbour is also a neighbour
  --> Yes, at quality 0.5 * 0.5 = 1/4. This value is stored in 'best'.

  // see wether we find a better route via an MPR
  --> Yes, the 2-hop path A-B-D has path_link_quality 1 * 1 = 1

  k is set to 0

  // look for the best 1-hop neighbour that we haven't
  // yet selected
  --> 'best' is set to -1.0, and the search results in neighbor B,
      'best' being 1. B is selected as MPR.

  k goes to 1

  // look for the best 1-hop neighbour that we haven't
  // yet selected
  --> 'best' is set to -1.0, and the search results in neighbor C,
      'best' being 1/16. C is also selected as MPR, even through the
      path link quality is lower than the quality of the 1-hop link
      A - D, which is 1/4 .

  The inconsistency is: if path A - B - D would not be there at all,
  the LQ-MPR calculation would see only A - D as the 1-hop path and
  never even consider C an MPR in the 2-hop path A - C - D. Now we add
  a better 2-hop path A - B - D, and suddenly node C becomes MPR in
  the 2-hop path A - C - D.


* Fix a bug in link-quality handling of hello messages inside
  olsr_process_message_neighbors. This bug caused continuous flapping
  of MPR selection in certain situations.

  Suppose we have 3 nodes, A, B, C in the following topology:

                1 ------ 1
        1   1   /        \
      A ----- B            C
                \        /
              0.5 ------ 0.5

  The number on each side of the link is the link quality value as observed
  by the corresponding node.

  B sends a LQ hello message indicating that has a neighbor C at LQ=1 and NLQ=1,
  and it has neighbor C at LQ=0.5 and NLQ=0.5 .

  A receives the LQ hello message by B. Currently, the olsr_process_message_neighbors
  function in node A does the following (follow along in the source code):

  for(message_neighbors = message->neighbors;
      message_neighbors != NULL;
      message_neighbors = message_neighbors->next)
  --> Let's assume the neighbor message "B has C at LQ=1 and NLQ=1" is processed first

            if (olsr_cnf->lq_level > 0)
  --> yes, we're doing LQ OLSR

              link = get_best_link_to_neighbor(&neighbor->neighbor_main_addr);
  --> there is only one link, A - B, link->loss_link_quality = 1 and
      link->neigh_link_quality = 1.

                  // have we found the one-hop neighbor that sent the
                  // HELLO message that we're current processing?

                  if (walker->neighbor == neighbor)
  --> ok, found 2-hop neighbor C via 1-hop neighbor B

                      walker->second_hop_link_quality =
                        message_neighbors->link_quality *
                        message_neighbors->neigh_link_quality;
  --> walker->second_hop_link_quality becomes 1 * 1 = 1.

                      walker->path_link_quality =
                        walker->second_hop_link_quality *
                        link->loss_link_quality * link->neigh_link_quality;
  --> walker->path_link_quality becomes 1 * 1 * 1 = 1

  for(message_neighbors = message->neighbors;
      message_neighbors != NULL;
      message_neighbors = message_neighbors->next)
  --> Now the neighbor message "B has C at LQ=0.5 and NLQ=0.5" is processed.

            if (olsr_cnf->lq_level > 0)
  --> yes, we're doing LQ OLSR

              link = get_best_link_to_neighbor(&neighbor->neighbor_main_addr);
  --> there is only one link, A - B, link->loss_link_quality = 1 and
      link->neigh_link_quality = 1.

                  // have we found the one-hop neighbor that sent the
                  // HELLO message that we're current processing?

                  if (walker->neighbor == neighbor)
  --> ok, found 2-hop neighbor C via 1-hop neighbor B

                      walker->second_hop_link_quality =
                        message_neighbors->link_quality *
                        message_neighbors->neigh_link_quality;
  --> walker->second_hop_link_quality becomes 0.5 * 0.5 = 0.25.

                      walker->path_link_quality =
                        walker->second_hop_link_quality *
                        link->loss_link_quality * link->neigh_link_quality;
  --> walker->path_link_quality becomes 0.25 * 1 * 1 = 0.25


  So: the better walker->path_link_quality of 1 is now overwritten by the worse
  walker->path_link_quality of 0.25 simply because the worst neighbor message
  by B was transmitted last. If B would have put its neighbor messages the other
  way round into its LQ hello message, the better walker->path_link_quality
  would have 'survived' instead.

  In other words, depending on the order of neighbor messages in a LQ-hello
  message, 2-hop path link qualities may change, even though nothing really
  changes in the network.

  This is the cause for a lot of 'route/MPR flapping' in the OLSR network, which
  we experienced during testing.


* Fix duplicate entries of aliases in MID entries.


* Remove no longer declared aliases from MID entries.


* Moved annoying printing of duplicate tables down to debug level 8.


* Fix: when link quality routing is enabled, HNA route calculation should be based on
  link quality, not on hopcount.


* Prevent two nodes that are advertising the same HNA from processing each
  other's HNA ending up in the two nodes routing the same (sub)net to each
  other in deadlock.
  To this end, added functions find_local_hna4_entry(...) and
  find_local_hna6_entry(...)





More information about the Olsr-dev mailing list