[OLSR-users] Assertion `metric_counter' failed (IPV4)

Philippe Vanhaesendonck (spam-protected)
Thu May 19 19:06:55 CEST 2005


Andreas,

Thank you for your help on this issue.

I have ethernet access to both nodes, so yes, no problem to install
custom software!

I have applied the patch, and at the same time ran a script on the
neighbour node to change its IP address:
while true
do
  ifconfig eth2 down
  sleep 60
  ifconfig eth2 10.11.12.12 up
  sleep 5
  ifconfig eth2 10.11.1.197
  sleep 120
done

after a while, bingo olsrd...
... segfault :-\

Mmmh -- looks like we are using the wrong variable in the patch for the
tmp_node iteration... Let's replace destination_ptr by tmp_node,
re-build the package, restart the test and....
... Yes! We have a new occurrence of the problem, with a complete trace
file this time.
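
For the record, here is roughly what the corrected debug loop looks like.
This is only a minimal sketch, not the exact code I built: struct
rt_entry_stub and dump_stale_routes() are hypothetical stand-ins (olsrd's
real route entry holds binary addresses and an interface pointer, which I
replace with plain strings here). The point is that the printf must read
through tmp_node, the variable the loop advances. In the patch as posted it
reads through destination_ptr, which has already run off the end of the list
in the "Find highest metric" loop, so it would be NULL at least the first
time through, hence the segfault.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for olsrd's route entry: only the fields the debug
 * output prints are modelled, and addresses are plain strings here. */
struct rt_entry_stub {
  const char *rt_dst;     /* destination address */
  const char *rt_router;  /* next hop */
  const char *if_name;    /* outgoing interface name */
  int rt_metric;          /* hop count */
};

/* Same shape as the list node in the patch, pointing at the stub entry. */
struct destination_n {
  struct rt_entry_stub *destination;
  struct destination_n *next;
};

/* Print every entry still on the list; returns how many were printed. */
int
dump_stale_routes(const struct destination_n *list)
{
  const struct destination_n *tmp_node;
  int count = 0;

  for (tmp_node = list; tmp_node != NULL; tmp_node = tmp_node->next)
    {
      assert(tmp_node->destination != NULL);

      /* Read through tmp_node, the variable the loop advances. */
      printf("Stale route to %s via %s by %s hopcount %d!\n",
             tmp_node->destination->rt_dst,
             tmp_node->destination->rt_router,
             tmp_node->destination->if_name,
             tmp_node->destination->rt_metric);
      count++;
    }

  return count;
}
```

In the patch this would sit where the while(tmp_node) loop is, called on
delete_kernel_list just before the assert(0).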

Trace file attached -- Hope this helps!

--
Phil

Andreas Tønnesen wrote:

>
> That's interesting info.
> Do you have the opportunity to build a special version of the software
> for one of your nodes and reproduce the problem? I have attached a patch
> that writes some info we could use in debugging this problem to stdout.
> This means that you can run olsrd with debuglevel 0 (and the -nofork
> option) and only the new debug output will appear.
> Thanks.
>
> - Andreas
>
> Philippe Vanhaesendonck wrote:
>
>> Hello everybody,
>>
>> As a follow-up to my initial message, I have observed that OLSR has not
>> crashed since one of its neighbours stopped changing its IP...
>> The uptime is now 40 hours, while it was crashing every couple of
>> hours in the past.
>>
>> So this tends to confirm we have an issue when a node changes its IP
>> (V4).
>> Unfortunately that node cannot cope with the load in debug mode, so I
>> cannot trace the problem further.
>>
>> -- 
>> Phil.
>>
>> Philippe Vanhaesendonck said:
>>
>>> Mmmmm...
>>>
>>> This drives me crazy...
>>>
>>> On that particular node, when I run OLSR in Debug mode, all the links
>>> end up having an NLQ and ETX of 0.0, so nothing goes into the routing
>>> tables and of course the assertion never fires...
>>> (see extract of output below)
>>> Maybe the hardware is just too slow when we are in debug mode: I have
>>> another AP of the same type, and:
>>>
>>>    * non debug: works ok
>>>    * debug and a few nodes accessible: works ok
>>>    * debug and all 25 nodes accessible: NLQ & ETX drop down to 0 and
>>>      routes are lost
>>>
>>>
>>> All that to say I can't reproduce the problem in debug mode...
>>> ... but I have discovered that one of the neighbours does indeed have
>>> spurious changes of IP!
>>> I will fix that and see if it changes something.
>>>
>>> ---
>>> Phil.
>>
>>
>>
>> <SNIP>
>>
>>
>> _______________________________________________
>> olsr-users mailing list
>> (spam-protected)
>> https://www.olsr.org/mailman/listinfo/olsr-users
>
>
>------------------------------------------------------------------------
>
>Index: src/process_routes.c
>===================================================================
>RCS file: /cvsroot/olsrd/olsrd-current/src/process_routes.c,v
>retrieving revision 1.24
>diff -C5 -r1.24 process_routes.c
>*** src/process_routes.c	28 Feb 2005 14:42:57 -0000	1.24
>--- src/process_routes.c	19 May 2005 08:44:05 -0000
>***************
>*** 296,323 ****
>  {
>    struct destination_n *destination_ptr;
>    int metric_counter = 1;
>    olsr_bool last_run = OLSR_FALSE;
>  
>    /* Find highest metric */
>    for(destination_ptr = delete_kernel_list;
>        destination_ptr != NULL;
>        destination_ptr = destination_ptr->next)
>      {
>        if(destination_ptr->destination->rt_metric > metric_counter)
>  	metric_counter = destination_ptr->destination->rt_metric;
>      }
>  #ifdef DEBUG
>    OLSR_PRINTF(3, "%s highest metric %d\n",
>  	      __func__, metric_counter)
>  #endif
>   
>    while(delete_kernel_list!=NULL)
>      {
>        struct destination_n *previous_node = delete_kernel_list;
>  
>!       assert(metric_counter);
>  
>        /* searching for all the items with metric equal to n */
>        for(destination_ptr = delete_kernel_list; destination_ptr != NULL; )
>  	{
>  
>--- 296,345 ----
>  {
>    struct destination_n *destination_ptr;
>    int metric_counter = 1;
>    olsr_bool last_run = OLSR_FALSE;
>  
>+   printf("olsr_delete_routes_from_kernel, metrics:\n");
>    /* Find highest metric */
>    for(destination_ptr = delete_kernel_list;
>        destination_ptr != NULL;
>        destination_ptr = destination_ptr->next)
>      {
>        if(destination_ptr->destination->rt_metric > metric_counter)
>  	metric_counter = destination_ptr->destination->rt_metric;
>+       printf("\t%s : %d\n", olsr_ip_to_string(&destination_ptr->destination->rt_dst),
>+ 	     destination_ptr->destination->rt_metric);
>      }
>+   printf("\nhighest metric %d\n", metric_counter);
>+ 
>  #ifdef DEBUG
>    OLSR_PRINTF(3, "%s highest metric %d\n",
>  	      __func__, metric_counter)
>  #endif
>   
>    while(delete_kernel_list!=NULL)
>      {
>        struct destination_n *previous_node = delete_kernel_list;
>  
>!       if(metric_counter == 0) 
>! 	{
>! 	  struct destination_n *tmp_node = delete_kernel_list;
>! 
>! 	  printf("Stale route(s) detected in %s!\n", __func__);
>! 
>! 	  while(tmp_node)
>! 	    {
>! 	      printf("Stale route to to %s via %s by %s hopcount %d!\n",
>! 		     olsr_ip_to_string(&destination_ptr->destination->rt_dst),
>! 		     olsr_ip_to_string(&destination_ptr->destination->rt_router),
>! 		     destination_ptr->destination->rt_if->int_name,
>! 		     destination_ptr->destination->rt_metric);
>! 	      tmp_node = tmp_node->next;
>! 	    }
>! 
>! 	  assert(0);
>! 	}
>  
>        /* searching for all the items with metric equal to n */
>        for(destination_ptr = delete_kernel_list; destination_ptr != NULL; )
>  	{
>  
>***************
>*** 325,334 ****
>--- 347,363 ----
>  	     ((last_run || 
>  	       !COMP_IP(&destination_ptr->destination->rt_dst, 
>                          &destination_ptr->destination->rt_router))))
>  	    {
>  	      olsr_16_t error;
>+ 
>+ 	      printf("Deleting route to to %s via %s by %s hopcount %d!\n",
>+ 		     olsr_ip_to_string(&destination_ptr->destination->rt_dst),
>+ 		     olsr_ip_to_string(&destination_ptr->destination->rt_router),
>+ 		     destination_ptr->destination->rt_if->int_name,
>+ 		     destination_ptr->destination->rt_metric);
>+ 
>  #ifdef DEBUG
>  	      OLSR_PRINTF(3, "Deleting route to %s hopcount %d\n",
>  			  olsr_ip_to_string(&destination_ptr->destination->rt_dst),
>  			  destination_ptr->destination->rt_metric)
>  #endif
>  
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: olsrd.539.log.gz
Type: application/gzip
Size: 10764 bytes
Desc: not available
URL: <http://lists.olsr.org/pipermail/olsr-users/attachments/20050519/f7663ebc/attachment.gz>

