[Olsr-dev] Triggered LQ Messages

Tue Feb 12 14:20:30 CET 2008

Hi

What are the situations that trigger an LQ Hello?

A change in the neighbour set?

If these are the only reasons, I think they are of no practical use.
(Please correct me if this is wrong (-;)

Here are the reasons for the above statement:

When losing the connection to a neighbour, olsr usually needs 200 seconds of a
dead link before "accepting it",
so there is really no urgency anymore to trigger an LQ Hello.

When getting a new neighbour, its LQ is low at the beginning, so there is
nothing to be proud of (no need for an immediate LQ Hello).

While I see no benefit in these messages, I dislike the following:

As far as I know, the logic producing the LQ value puts every hello into its
window and, while nothing is received, assumes one loss every
3 * lq_hello interval
(which I think is far too slow).

But in fact it may receive 100 triggered hellos in the same time period
(an LQ Hello frequency of about 5 Hz is not unusual near the uplink in Vienna).

And this results in very unusable LQ values.
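The window logic described above can be sketched in a few lines of Python. This is only a toy model of that description, not olsrd's actual code: the class and function names, the window size of 100 and the 5 second hello interval are all assumptions made for illustration.

```python
from collections import deque


class LqWindow:
    """Toy sliding-window LQ estimator (hypothetical, not olsrd's code)."""

    def __init__(self, size=100):
        # True = hello received, False = hello assumed lost.
        self.slots = deque(maxlen=size)

    def hello_received(self):
        self.slots.append(True)

    def loss_assumed(self):
        self.slots.append(False)

    def lost(self):
        return sum(1 for ok in self.slots if not ok)

    def lq(self):
        if not self.slots:
            return 0.0
        return 1.0 - self.lost() / len(self.slots)


def silent_outage(outage_s, hello_interval_s=5.0, size=100):
    """Start from a full, loss-free window, then hear nothing for outage_s."""
    win = LqWindow(size)
    for _ in range(size):
        win.hello_received()
    # Only the timeout books losses: one per 3 * hello interval of silence.
    for _ in range(int(outage_s // (3 * hello_interval_s))):
        win.loss_assumed()
    return win


win = silent_outage(outage_s=20)
print(win.lost(), "of", len(win.slots), "lost, lq", round(win.lq(), 2))
```

Under these assumptions, 20 seconds of complete silence books a single loss, so the LQ value hardly moves while the link is in fact dead.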

Take this sample from a link over a 5 GHz bridge (having some trouble with
trees and wind or similar, or whatever):

12:00:00 lq 0.00 - 0 of  0 lost (bridge associates)
12:00:20 lq 0.95 - 5 of 100 lost (already 100 lq hellos transmitted)
12:01:00 lq 0.96 - 4 of 100 lost (bridge loses association)
12:01:20 lq 0.95 - 5 of 100 lost (slowly assuming losses)
12:01:21 lq 0.00 - 94 of 100 lost (first packet after link association had a
quite high sequence number, resulting in adding 90 losses)
12:01:45 lq 0.96 - 4 of 100 lost (everything is now like before the bridge
lost association)
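The jump at 12:01:21 can be reproduced with a small Python sketch. It assumes that every skipped sequence number is booked as one loss when the next hello arrives; the function names, the starting layout of the window and the sequence numbers are made up for illustration, and this is not olsrd's code.

```python
from collections import deque

SEQ_MODULO = 1 << 16  # OLSR sequence numbers are 16 bits wide


def record_hello(window, last_seq, seq):
    """Book one received hello; each skipped sequence number counts as a loss."""
    missed = (seq - last_seq - 1) % SEQ_MODULO
    window.extend([False] * missed)  # False = lost
    window.append(True)              # True = received
    return seq


def lost(window):
    return sum(1 for ok in window if not ok)


# Window just before the bridge comes back: 5 of 100 lost (layout is made up).
window = deque([True] * 95 + [False] * 5, maxlen=100)
# Hypothetical sequence numbers: 90 hellos were sent while the link was down.
record_hello(window, last_seq=1000, seq=1091)
print(lost(window), "of", len(window), "lost")
```

A single received hello with a large sequence gap pushes 90 losses into the window at once, so almost the whole window flips to "lost" exactly at the moment the link works again.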

Looks fine,
but the results are in fact horrific (don't get angry about the bad
jokes/thoughts of the imaginary user).

Assuming the default route goes over the bridge at the beginning, the
following happens to me (the imaginary user (-;):

-- 

Everything is fine and fast, I'm surfing around, until the bridge loses
association (damn shit, this f*** happens again).
I sit here and wait for olsrd to take another route (knowing this will take
an unbelievable 3 minutes).

But the bridge comes up again only 20 seconds later, hurray!
Now I can immediately continue surfing...

But no!
Now olsr decides to switch the default route (I think maybe it would be
better to have no alternative routes at all),
but this route isn't working instantly, so I wait while the information is
propagated in the net
(knowing that all this is pure nonsense because the bridge is up again).

20 seconds later olsr decides the route over the bridge is better (in fact
it is, stupid olsr!!!).
It switches again, but the wrong link is now propagated in the net
(causing some temporary loops or so somewhere).

So I wait another 20 seconds
(thinking about the internet cafe on the other side of the street...).

---
Or in numbers: in such a scenario an outage of 10 seconds on a link can
result in an ETX change from 1 to 4 (assuming 5 LQ Hellos per
second), causing the loss of full connectivity for about one minute.

Without so many triggered LQ Hellos, chances would be high that after 10
seconds the ETX would just rise from 1 to 1.05, having no side effects, so
that there is just a 10 second connectivity loss.
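These numbers can be checked with a rough Python calculation. It uses ETX = 1 / (LQ * NLQ) and assumes a window of 100 hellos, the same loss in both directions, and, for the case without triggered hellos, one regular hello every 5 seconds. The function names and these parameters are assumptions for illustration only.

```python
def etx(lq, nlq):
    """ETX from both link-quality directions: 1 / (LQ * NLQ)."""
    if lq == 0 or nlq == 0:
        return float("inf")
    return 1.0 / (lq * nlq)


def etx_after_outage(hellos_per_second, outage_s, window=100):
    """ETX right after an outage, starting from a loss-free window."""
    # Every hello sent during the outage ends up as one lost slot.
    missed = min(window, round(hellos_per_second * outage_s))
    lq = 1.0 - missed / window
    # Assumption: the reverse direction (NLQ) suffers the same loss.
    return etx(lq, lq)


# 5 triggered hellos per second, 10 second outage.
print(etx_after_outage(5, 10))
# One regular hello every 5 seconds (assumed), same outage.
print(round(etx_after_outage(0.2, 10), 2))
```

With 5 hellos per second, half of the window is lost after 10 seconds, so LQ and NLQ both drop to 0.5 and the ETX becomes 4. With one hello every 5 seconds only two slots are lost, which gives an ETX of about 1.04, in the same range as the 1.05 mentioned above.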

If other nodes use this node as their main gateway, this unfortunate behaviour
may affect large regions of a mesh.

---
To come to a proposal:

What about stopping triggered LQ Hellos completely, or at least no longer
counting them like normal hellos?

Markus

P.S. Maybe some other events are better suited to trigger olsr messages
(with high TTL), like a switch of the default route.