[OLSR-users] Assertion `metric_counter' failed (IPV4)
Philippe Vanhaesendonck
(spam-protected)
Thu May 19 19:06:55 CEST 2005
Andreas,
Thank you for your help on this issue.
I have ethernet access to both nodes, so yes, no problem to install
custom software!
I have applied the patch, and at the same time ran a script on the
neighbour node to change its IP address:
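while true
do
    # take the interface down for a minute
    ifconfig eth2 down
    sleep 60
    # bring it back up with one address...
    ifconfig eth2 10.11.12.12 up
    sleep 5
    # ...then switch to a second address shortly afterwards
    ifconfig eth2 10.11.1.197
    sleep 120
done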
After a while, bingo: olsrd...
... segfaults :-\
Mmmh -- looks like we are using the wrong variable in the patch for the
tmp_node iteration... Let's replace destination_ptr by tmp_node,
re-build the package, restart the test and....
... Yes! We have a new occurrence of the problem, with a complete trace
file this time.
Trace file attached -- Hope this helps!
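
For the archive, the segfault is consistent with the patch as posted: the
"Find highest metric" loop only stops once destination_ptr is NULL, so at
least on the first pass the new dump loop dereferences a NULL
destination_ptr while it is tmp_node that walks the list. Below is a
minimal sketch of the corrected loop. The structs are hypothetical,
simplified stand-ins (addresses already formatted as strings), not the
real olsrd definitions, and dump_stale_routes is a name made up for this
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an olsrd route entry; only what the dump needs. */
struct route_entry {
    const char *dst;      /* destination, already formatted as text */
    const char *router;   /* next hop, already formatted as text */
    const char *if_name;  /* outgoing interface name */
    int metric;           /* hop count */
};

/* Hypothetical stand-in for the singly linked delete list node. */
struct destination_n {
    struct route_entry *destination;
    struct destination_n *next;
};

/* Print every entry still on the list and return how many were printed. */
static int dump_stale_routes(const struct destination_n *list)
{
    const struct destination_n *tmp_node;
    int count = 0;

    for (tmp_node = list; tmp_node != NULL; tmp_node = tmp_node->next) {
        assert(tmp_node->destination != NULL);
        /* Dereference tmp_node, the cursor this loop actually advances. */
        printf("Stale route to %s via %s by %s hopcount %d!\n",
               tmp_node->destination->dst,
               tmp_node->destination->router,
               tmp_node->destination->if_name,
               tmp_node->destination->metric);
        count++;
    }
    return count;
}
```

Called with the head of the list (delete_kernel_list in the patch), this
prints one line per remaining entry and is safe on an empty list.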
--
Phil
Andreas Tønnesen wrote:
>
> That's interesting info.
> Do you have the opportunity to build special software for one of your
> nodes and reproduce the problem? I have attached a patch that writes
> some info we could use in debugging this problem to stdout. This means
> that you can run olsrd with debuglevel 0 (and the -nofork option) and
> only the new debug output will appear.
> Thanks.
>
> - Andreas
>
> Philippe Vanhaesendonck wrote:
>
>> Hello everybody,
>>
>> As follow-up to my initial message, I have observed that OLSR has not
>> crashed since one of its neighbours stopped changing its IP...
>> The uptime is now 40 hours, while it was crashing every couple of
>> hours in the past.
>>
>> So this tends to confirm we have an issue when a node changes its IP
>> (V4).
>> Unfortunately that node cannot cope with the load in debug mode, so I
>> cannot trace the problem further.
>>
>> --
>> Phil.
>>
>> Philippe Vanhaesendonck said:
>>
>>> Mmmmm...
>>>
>>> This drives me crazy...
>>>
>>> On that particular node, when I run OLSR in Debug mode, all the links
>>> end up with an NLQ and ETX of 0.0, so nothing is in the routing
>>> tables and of course the assertion never comes...
>>> (see extract of output below)
>>> Maybe the hardware is just too slow when we are in debug mode: I have
>>> another AP of the same type, and:
>>>
>>> * non debug: works ok
>>> * debug and a few nodes accessible: works ok
>>> * debug and all 25 nodes accessible: NLQ & ETX drop down to 0 and
>>> routes are lost
>>>
>>>
>>> All that to say I can't reproduce the problem in debug mode...
>>> ... but I have discovered that one of the neighbours does indeed have
>>> spurious changes of IP!
>>> I will fix that and see if it changes something.
>>>
>>> ---
>>> Phil.
>>
>>
>>
>> <SNIP>
>>
>>
>> _______________________________________________
>> olsr-users mailing list
>> (spam-protected)
>> https://www.olsr.org/mailman/listinfo/olsr-users
>
>
>------------------------------------------------------------------------
>
>Index: src/process_routes.c
>===================================================================
>RCS file: /cvsroot/olsrd/olsrd-current/src/process_routes.c,v
>retrieving revision 1.24
>diff -C5 -r1.24 process_routes.c
>*** src/process_routes.c 28 Feb 2005 14:42:57 -0000 1.24
>--- src/process_routes.c 19 May 2005 08:44:05 -0000
>***************
>*** 296,323 ****
> {
> struct destination_n *destination_ptr;
> int metric_counter = 1;
> olsr_bool last_run = OLSR_FALSE;
>
> /* Find highest metric */
> for(destination_ptr = delete_kernel_list;
> destination_ptr != NULL;
> destination_ptr = destination_ptr->next)
> {
> if(destination_ptr->destination->rt_metric > metric_counter)
> metric_counter = destination_ptr->destination->rt_metric;
> }
> #ifdef DEBUG
> OLSR_PRINTF(3, "%s highest metric %d\n",
> __func__, metric_counter)
> #endif
>
> while(delete_kernel_list!=NULL)
> {
> struct destination_n *previous_node = delete_kernel_list;
>
>! assert(metric_counter);
>
> /* searching for all the items with metric equal to n */
> for(destination_ptr = delete_kernel_list; destination_ptr != NULL; )
> {
>
>--- 296,345 ----
> {
> struct destination_n *destination_ptr;
> int metric_counter = 1;
> olsr_bool last_run = OLSR_FALSE;
>
>+ printf("olsr_delete_routes_from_kernel, metrics:\n");
> /* Find highest metric */
> for(destination_ptr = delete_kernel_list;
> destination_ptr != NULL;
> destination_ptr = destination_ptr->next)
> {
> if(destination_ptr->destination->rt_metric > metric_counter)
> metric_counter = destination_ptr->destination->rt_metric;
>+ printf("\t%s : %d\n", olsr_ip_to_string(&destination_ptr->destination->rt_dst),
>+ destination_ptr->destination->rt_metric);
> }
>+ printf("\nhighest metric %d\n", metric_counter);
>+
> #ifdef DEBUG
> OLSR_PRINTF(3, "%s highest metric %d\n",
> __func__, metric_counter)
> #endif
>
> while(delete_kernel_list!=NULL)
> {
> struct destination_n *previous_node = delete_kernel_list;
>
>! if(metric_counter == 0)
>! {
>! struct destination_n *tmp_node = delete_kernel_list;
>!
>! printf("Stale route(s) detected in %s!\n", __func__);
>!
>! while(tmp_node)
>! {
>! printf("Stale route to to %s via %s by %s hopcount %d!\n",
>! olsr_ip_to_string(&destination_ptr->destination->rt_dst),
>! olsr_ip_to_string(&destination_ptr->destination->rt_router),
>! destination_ptr->destination->rt_if->int_name,
>! destination_ptr->destination->rt_metric);
>! tmp_node = tmp_node->next;
>! }
>!
>! assert(0);
>! }
>
> /* searching for all the items with metric equal to n */
> for(destination_ptr = delete_kernel_list; destination_ptr != NULL; )
> {
>
>***************
>*** 325,334 ****
>--- 347,363 ----
> ((last_run ||
> !COMP_IP(&destination_ptr->destination->rt_dst,
> &destination_ptr->destination->rt_router))))
> {
> olsr_16_t error;
>+
>+ printf("Deleting route to to %s via %s by %s hopcount %d!\n",
>+ olsr_ip_to_string(&destination_ptr->destination->rt_dst),
>+ olsr_ip_to_string(&destination_ptr->destination->rt_router),
>+ destination_ptr->destination->rt_if->int_name,
>+ destination_ptr->destination->rt_metric);
>+
> #ifdef DEBUG
> OLSR_PRINTF(3, "Deleting route to %s hopcount %d\n",
> olsr_ip_to_string(&destination_ptr->destination->rt_dst),
> destination_ptr->destination->rt_metric)
> #endif
>
>
>------------------------------------------------------------------------
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: olsrd.539.log.gz
Type: application/gzip
Size: 10764 bytes
Desc: not available
URL: <http://lists.olsr.org/pipermail/olsr-users/attachments/20050519/f7663ebc/attachment.gz>