[Olsr-dev] [Olsr-users] pre-Xmas Bug Hunting and other stuff
Bernd Petrovitsch
(spam-protected)
Fri Dec 21 12:37:25 CET 2007
OK, I'm overworked and need holidays. Now with the correct mailing list
addresses in the Cc: (and thus a full quote):
On Fri, 2007-12-21 at 12:23 +0100, Bernd Petrovitsch wrote:
> Hi all!
>
> I have been quite quiet lately - mainly because my day job also
> needs attention (and pre-Xmas time in Vienna implies various Xmas
> parties and meeting people for a beer;-).
>
> More serious:
>
> We (at least Hannes Gredler and /me) *thought* that it is a good point
> (and time) for a release (and several others didn't disagree:-).
> Sven-Ola Tücke has some build fixes and cleanups for Windows in
> some patches which can/should IMHO go in before that.
>
> We are also in the process of migrating from CVS (on sf.net) to
> Mercurial (on sf.net). BTW that was driven primarily by Hannes.
> The benefits are:
> - since Mercurial is one of these modern distributed SCMs, it is
> easier for developers to move changesets around.
> - since the anonymous access to the main repository[0] will go over a
> CGI script on the sf.net server, there shouldn't be any delay (of
> several hours) between a commit mail and the actual change in the
> publicly visible repository.
> - Mercurial automagically provides an RSS feed out of a repository.
> - Since I'm personally quite mail-centric, we will also send emails on
> changes to the main repository.
> To minimize changes and effort on all sides, I intend to keep even the
> (spam-protected) mailing list for commit mails.
>
> For a smooth transition, we need to update the documentation on
> http://www.olsr.org/ - including a simple introduction for people
> knowing CVS (WTH - I'm such a person). This should happen over Xmas
> holiday time.
>
> Why the above *thought*:
> The FunkFeuer net in Vienna started upgrading several nodes to 0.5.4 -
> including the gateway to the rest of the Internet. However, we
> experienced route flaps in the net afterwards.
>
> Summarizing from the internal FunkFeuer core list (in the Cc:, which is
> otherwise in German):
>
> It turned out that once in a while (the "while" can be AFAIK from a few
> minutes to lots of minutes) olsrd decides that one neighbor is not
> reachable (read: ETX == 0) and drops all routes to it. After a while,
> the connection is back and all routes are installed again (and
> everything is as before).
> And that is quite noticeable if that "dropped" neighbor is the main link
> to the Internet gateway.
> And this also happens on openvpn tunneled connections which are usually
> more like ETX == 1.00.
>
> The thread on http://www.freifunk-bno.de/forum/index.php?topic=930.0 (in
> German, found via Google) seems BTW to be a similar problem.
>
> Reverting to 0.5.0 on the Internet gateway solved (or at least seems to)
> the problem. So it seems to have to do with the olsrd version - and thus
> the decision to postpone a release until the cause of it is clear.
>
> ATM no one knows (to the best of my knowledge) if that is a new bug in
> the implementation or a combination of (vastly?) different versions or
> something hidden which is now only coming up or something completely
> different.
>
> The main question IMHO is: Why is olsrd deciding that ETX == 0 at some
> point in time even on a stable link?
>
> That requires adding debug code to a known br0ken version (the above
> thread indicates that e.g. CVS-HEAD is one) and finding the cause of it
> on a node that reliably shows that issue.
>
> If you are not so into programming and debugging:
> It would also help if someone experiencing such a problem reliably and
> quickly could find the point in time in the CVS where it started to
> happen - or at least the two points where it is definit^Wmost certainly
> not in (at or after 0.5.0) or - later on - is there (at or before 0.5.4).
> Then everyone can look at the code.
>
> That requires getting some in-between version from the CVS, trying it
> out and seeing if it occurs. Write that down and take a later or an
> earlier one. Ideally one takes the center of the remaining interval to
> minimize the tries.
> Repeat until you feel you know the above result.
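
The search described above is a plain bisection. As a rough sketch only
(the function name first_bad, the is_bad callback and the snapshot labels
are made up for illustration, they are not part of olsrd or CVS), and
assuming the snapshots are ordered oldest to newest, the first one is
known good, the last one is known bad, and the problem stays once it has
appeared:

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest revision for which is_bad() reports the problem.

    Assumptions (not checked here): revisions are ordered oldest to newest,
    revisions[0] is known good, revisions[-1] is known bad, and every
    revision after the first bad one is bad as well.
    """
    if len(revisions) < 2:
        raise ValueError("need at least a known-good and a known-bad revision")
    good, bad = 0, len(revisions) - 1
    while bad - good > 1:
        # Take the center of the remaining interval to minimize the tries.
        mid = (good + bad) // 2
        if is_bad(revisions[mid]):
            bad = mid      # problem already present: look earlier
        else:
            good = mid     # still fine: look later
    return revisions[bad]


# Hypothetical snapshot labels; in practice is_bad() means "build that
# snapshot, run it on an affected node and watch for the route flaps".
snapshots = ["0.5.0", "snap-a", "snap-b", "snap-c", "snap-d", "snap-e", "0.5.4"]
simulated_bad = {"snap-d", "snap-e", "0.5.4"}
print(first_bad(snapshots, lambda rev: rev in simulated_bad))
```

Each try halves the remaining interval, so even a few hundred candidate
snapshots need fewer than ten builds. If the flaps only show up
occasionally, a "good" verdict is only as reliable as the time spent
watching the node.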
>
> Further helpful information is of course also welcome. Including
> corrections of the above if I missed or misunderstood something.
>
> Bernd
>
> [0]: I have to admit that I forgot to ask Hannes what the official term
> for that in Mercurial speak is;-)
> --
> Firmix Software GmbH http://www.firmix.at/
> mobil: +43 664 4416156 fax: +43 1 7890849-55
> Embedded Linux Development and Services
>
>
>
> --
> Olsr-users mailing list
> (spam-protected)
> http://lists.olsr.org/mailman/listinfo/olsr-users
--
Firmix Software GmbH http://www.firmix.at/
mobil: +43 664 4416156 fax: +43 1 7890849-55
Embedded Linux Development and Services