[Olsr-dev] Triggered LQ Messages

Tue Feb 12 17:35:18 CET 2008

Markus Kittenberger wrote:
>     you are observing artifacts of a broken implementation, any olsr
>     neighbor
> 
> 
> maybe, but 
> my sample can be reproduced between 2 routers with fff 1.6.25, with 
> default settings,..

this bug is unrelated to the NLQ=0 bug that was unveiled in late december,
i.e. it is unrelated to scaling. link sensing is just broken in
its asymmetrical implementation. the way it should be implemented is
that each hello time interval a check "did i get at least a single
hello from my neighbor" is done: if that's ok the LQ is bumped up,
if the neighbor is silent then LQ is squashed down one step.
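
a minimal sketch of that per-interval check, just to illustrate the idea. this is not the olsrd code: the names update_lq and simulate and the step size of 0.1 are assumptions made up for the example.

```python
def update_lq(lq, heard_hello, step=0.1):
    """Run once per hello interval.

    heard_hello: True if at least one hello arrived from the neighbor
    during the interval that just ended.
    step: hypothetical step size, the same in both directions so that
    the link degrades and recovers at the same rate.
    """
    if heard_hello:
        lq += step
    else:
        lq -= step
    # keep the result inside the valid LQ range 0.0 .. 1.0
    return min(1.0, max(0.0, lq))


def simulate(heard_sequence, lq=1.0, step=0.1):
    """Apply update_lq once per interval and return the LQ history."""
    history = []
    for heard in heard_sequence:
        lq = update_lq(lq, heard, step)
        history.append(round(lq, 2))
    return history


if __name__ == "__main__":
    # neighbor silent for two intervals, then heard again
    print(simulate([False, False, True]))
```

with equal steps a dead neighbor reaches LQ 0 after a fixed number of hello intervals, no matter how much other olsr traffic crosses the link.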

>     (and its corresponding tc_edge) must be removed after the neighbor
>     holdtime expires, 200s sounds really broken.
> 
> 
> just tested:
> 16:15:45 LQ 1.0 NLQ 1.0 ETX 1.0 (unplugging)
> 16:16:45 LQ 0.97 NLQ 1.00 ETX 1.03
> 16:18:00 ETX 1.08
> 16:18:30 no neighbour
> 
> so we've about 135 .. 165 seconds in fff 1.6.25,

agreed, that's broken ...

> Where is the neighbour hold time specified?

crawl through link_set.c

> do you mean HelloValidityTime?
> 
>     IMO it would be a better start to fix hold-time detection, such that
>     it behaves symmetrical. afaik henning has an experimental patch that
>     removes the whole windowing as is today by a simple exponential backoff
>     formula.
> 
> 
> this sounds fine/interesting, but i think it should be considered which olsr 
> messages link sensing counts,.. 
> because at the moment links with much olsr traffic behave quite 
> differently from links with less traffic
> 
> i just (re)tested a router with only 1 neighbour, a dead end in fact
> on the other side is a small node with about 4 neighbours
> 
> when disabling the link for 40 
> seconds, the lq on my router drops from 1.0 to 0.4 (60 lost packets 
> due to counted olsr traffic from nodes further away in the net)
> while on the other side it only drops to 0.8 (tc interval of 2, means 
> 20 lost packets, because being a dead end my router produces no more 
> olsr traffic than its own)
> (due to the window, the lower lq values stay stable for 160 seconds (on 
> the neighbour of the dead end router) and 30 seconds (on the dead end 
> router))
> 
> my guess is that all olsr packets of the specific neighbour are used for 
> link sensing,..

only Hello packets are considered.

> and this causes problems, because in dense net areas, this means that lq 
> values can change dramatically fast,..
> 
> so my proposal is to count only TC-LQ messages originated by the neighbour.

hmm this is a bad idea, as there is no good determinism on which interface a
TC message will enter the box. It is the nature of flooding that it may
enter at any interface which has at least one neighbor in the 2way state.

> but this may/will get tricky on calculating the correct amount of lost 
> packets, (in fact the TC-LQ need their own sequence number for this)
> they do have a sequence number, but what increments it? every (forwarded 
> or not) tc-lq?

> p.s. longer periods of completely dead links may have less impact on ETX 
> than short ones!!!!
> examples on the same link as above (each time starting with ETX of 1.0)
> 
> 10 seconds disabled LQ 0.85, NLQ 0.95, ETX 1.25
> 40 seconds disabled LQ 0.4 , NLQ 0.8, ETX 3.15
> 70 seconds disabled LQ 0.96, NLQ 0.7, ETX 1.48
> last example was too long for the 100 packets window, but afterwards it 
> reported 4 of 100 lost, instead of 0 lost of 5 or so as i expected ???
> 
> on higher frequented links 
> 30 seconds of silence should be enough to cause this effect,..
> 
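
for reference, a small sketch of how the three samples quoted above relate LQ and NLQ to ETX, assuming the usual definition ETX = 1 / (LQ * NLQ). the function name etx is just for illustration. the computed values agree with the reported ones to within the rounding of the displayed LQ/NLQ figures.

```python
def etx(lq, nlq):
    """Expected transmission count from the two directional link qualities.

    Assumes ETX = 1 / (LQ * NLQ); a link with zero quality in either
    direction is treated as unusable (infinite cost).
    """
    if lq <= 0.0 or nlq <= 0.0:
        return float("inf")
    return 1.0 / (lq * nlq)


# (seconds the link was disabled, LQ, NLQ) as reported in the quoted test
samples = [(10, 0.85, 0.95), (40, 0.4, 0.8), (70, 0.96, 0.7)]

if __name__ == "__main__":
    for seconds, lq, nlq in samples:
        print(f"{seconds:>3} s down: LQ {lq:.2f}  NLQ {nlq:.2f}  "
              f"ETX {etx(lq, nlq):.2f}")
```

the 70 second case shows the window effect directly: the outage is longer than the one in the 40 second case, yet the resulting ETX is lower.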

agreed, pls talk to henning who is rewriting this right now.

/hannes