[OLSR-users] Assertion `metric_counter' failed (IPV4)

Andreas Tønnesen (spam-protected)
Thu May 19 22:44:09 CEST 2005


Philippe,

Thanks (and sorry for the cut'n paste error). It is now clear what
causes the problem. The entry:
 >Stale route to to 10.11.1.197 via 10.11.1.197 by wlan0 hopcount 3!<
has a hopcount of 3, yet it has itself as next hop. So there seems to be
a problem with the route calculation at some point...
This should not be all that harmful as long as the route in question
is actually deleted, and it's easy to create a quick fix along those
lines.
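
To illustrate why such an entry trips the check, here is a rough,
self-contained model of the deletion pass. It is only a sketch: the
struct and function names are made up for illustration, and it assumes
the counter is decremented once per pass with one extra pass at metric 1
for routes whose next hop is the destination itself (that part of the
loop is not visible in the diff).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a kernel route entry. */
struct route {
    const char *dst;     /* destination address */
    const char *router;  /* next hop */
    int metric;          /* hop count */
    bool deleted;
};

/*
 * Walk the metric from the highest value down to 1. Entries whose next
 * hop is the destination itself are skipped until one final extra pass
 * at metric 1. Returns the number of routes still pending when the
 * counter reaches 0, which is the state the assertion catches.
 */
static size_t routes_left_at_zero(struct route *r, size_t n)
{
    int metric_counter = 1;
    bool last_run = false;
    size_t pending = n;

    /* Find the highest metric, as the real function does first. */
    for (size_t i = 0; i < n; i++) {
        r[i].deleted = false;
        if (r[i].metric > metric_counter)
            metric_counter = r[i].metric;
    }

    while (pending > 0 && metric_counter > 0) {
        for (size_t i = 0; i < n; i++) {
            bool via_self = strcmp(r[i].dst, r[i].router) == 0;

            if (!r[i].deleted && r[i].metric == metric_counter &&
                (last_run || !via_self)) {
                r[i].deleted = true;
                pending--;
            }
        }

        if (metric_counter == 1 && !last_run)
            last_run = true;    /* repeat metric 1 for next-hop routes */
        else
            metric_counter--;
    }

    return pending;
}
```

With a single entry to 10.11.1.197 via 10.11.1.197 at hopcount 3, the
entry is skipped at metric 3 (next hop equals destination) and never
matches during the metric 1 passes, so it is still on the list when the
counter reaches 0.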
But I'd like to get to the bottom of this. Would it be possible for you
to build yet another version and reproduce the problem? The attached
patch will generate routing table output to stdout. Hopefully there will
not be enough output to cause trouble. And hopefully there are no cut'n
paste bugs ;)
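
For reference, the cut'n paste error in the previous patch was in the
stale route dump: the loop advanced tmp_node but printed through
destination_ptr, which is presumably NULL once the preceding for loop
has run to the end of the list. A corrected loop would look roughly like
the sketch below. The types are simplified stand-ins invented for this
example, not the real olsrd structures.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the types used in the patch. */
struct rt_entry_model {
    const char *dst;      /* destination address */
    const char *router;   /* next hop */
    const char *ifname;   /* outgoing interface */
    int metric;           /* hop count */
};

struct destination_model {
    struct rt_entry_model *destination;
    struct destination_model *next;
};

/*
 * Print every entry still on the list. All fields are read through
 * tmp_node, the same pointer the loop advances. Returns the number of
 * entries printed.
 */
static int dump_stale_routes(FILE *out, const struct destination_model *list)
{
    int count = 0;

    for (const struct destination_model *tmp_node = list;
         tmp_node != NULL;
         tmp_node = tmp_node->next) {
        fprintf(out, "Stale route to %s via %s by %s hopcount %d!\n",
                tmp_node->destination->dst,
                tmp_node->destination->router,
                tmp_node->destination->ifname,
                tmp_node->destination->metric);
        count++;
    }

    return count;
}
```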
Thanks again.

- Andreas

Philippe Vanhaesendonck wrote:
> Andreas,
> 
> Thank you for your help on this issue.
> 
> I have ethernet access to both nodes, so yes, no problem to install
> custom software!
> 
> I have applied the patch, and at the same time ran a script on the
> neighbour node to change its IP address:
> while true
> do
>   ifconfig eth2 down
>   sleep 60
>   ifconfig eth2 10.11.12.12 up
>   sleep 5
>   ifconfig eth2 10.11.1.197
>   sleep 120
> done
> 
> after a while, bingo olsrd...
> ... segfault :-\
> 
> Mmmh -- looks like we are using the wrong variable in the patch for the
> tmp_node iteration... Let's replace destination_ptr by tmp_node, 
> re-build the package, restart the test and....
> ... Yes! We have a new occurrence of the problem, with a complete trace
> file this time
> 
> Trace file attached -- Hope this helps!
> 
> --
> Phil
> 
> Andreas Tønnesen wrote:
> 
> 
>>That's interesting info.
>>Do you have the opportunity to build special software for one of your
>>nodes and reproduce the problem? I have attached a patch that writes
>>some info we could use in debugging this problem to stdout. This means
>>that you can run olsrd with debuglevel 0 (and the -nofork option) and
>>only the new debug output will appear.
>>Thanks.
>>
>>- Andreas
>>
>>Philippe Vanhaesendonck wrote:
>>
>>
>>>Hello everybody,
>>>
>>>As a follow-up to my initial message, I have observed that OLSR has not
>>>crashed since one of its neighbours stopped changing its IP...
>>>The uptime is now 40 hours, while it was crashing every couple of
>>>hours in the past.
>>>
>>>So this tends to confirm we have an issue when a node changes its
>>>IP (V4).
>>>Unfortunately that node cannot cope with the load in debug mode, so I
>>>cannot trace the problem further.
>>>
>>>-- 
>>>Phil.
>>>
>>>Philippe Vanhaesendonck said:
>>>
>>>
>>>>Mmmmm...
>>>>
>>>>This drives me crazy...
>>>>
>>>>On that particular node, when I run OLSR in debug mode, all the links
>>>>end up having an NLQ and ETX of 0.0, so nothing is in the routing
>>>>tables and of course the assertion never comes...
>>>>(see extract of output below)
>>>>Maybe the hardware is just too slow when we are in debug mode: I have
>>>>another AP of the same type, and:
>>>>
>>>>   * non debug: works ok
>>>>   * debug and a few nodes accessible: works ok
>>>>   * debug and all 25 nodes accessible: NLQ & ETX drop down to 0 and
>>>>     routes are lost
>>>>
>>>>
>>>>All that to say I can't reproduce the problem in debug mode...
>>>>... but I have discovered that one of the neighbours does indeed have
>>>>spurious changes of IP!
>>>>I will fix that and see if it changes something.
>>>>
>>>>---
>>>>Phil.
>>>
>>>
>>>
>>><SNIP>
>>>
>>>
>>>_______________________________________________
>>>olsr-users mailing list
>>>(spam-protected)
>>>https://www.olsr.org/mailman/listinfo/olsr-users
>>
>>
>>------------------------------------------------------------------------
>>
>>Index: src/process_routes.c
>>===================================================================
>>RCS file: /cvsroot/olsrd/olsrd-current/src/process_routes.c,v
>>retrieving revision 1.24
>>diff -C5 -r1.24 process_routes.c
>>*** src/process_routes.c	28 Feb 2005 14:42:57 -0000	1.24
>>--- src/process_routes.c	19 May 2005 08:44:05 -0000
>>***************
>>*** 296,323 ****
>> {
>>   struct destination_n *destination_ptr;
>>   int metric_counter = 1;
>>   olsr_bool last_run = OLSR_FALSE;
>> 
>>   /* Find highest metric */
>>   for(destination_ptr = delete_kernel_list;
>>       destination_ptr != NULL;
>>       destination_ptr = destination_ptr->next)
>>     {
>>       if(destination_ptr->destination->rt_metric > metric_counter)
>> 	metric_counter = destination_ptr->destination->rt_metric;
>>     }
>> #ifdef DEBUG
>>   OLSR_PRINTF(3, "%s highest metric %d\n",
>> 	      __func__, metric_counter)
>> #endif
>>  
>>   while(delete_kernel_list!=NULL)
>>     {
>>       struct destination_n *previous_node = delete_kernel_list;
>> 
>>!       assert(metric_counter);
>> 
>>       /* searching for all the items with metric equal to n */
>>       for(destination_ptr = delete_kernel_list; destination_ptr != NULL; )
>> 	{
>> 
>>--- 296,345 ----
>> {
>>   struct destination_n *destination_ptr;
>>   int metric_counter = 1;
>>   olsr_bool last_run = OLSR_FALSE;
>> 
>>+   printf("olsr_delete_routes_from_kernel, metrics:\n");
>>   /* Find highest metric */
>>   for(destination_ptr = delete_kernel_list;
>>       destination_ptr != NULL;
>>       destination_ptr = destination_ptr->next)
>>     {
>>       if(destination_ptr->destination->rt_metric > metric_counter)
>> 	metric_counter = destination_ptr->destination->rt_metric;
>>+       printf("\t%s : %d\n", olsr_ip_to_string(&destination_ptr->destination->rt_dst),
>>+ 	     destination_ptr->destination->rt_metric);
>>     }
>>+   printf("\nhighest metric %d\n", metric_counter);
>>+ 
>> #ifdef DEBUG
>>   OLSR_PRINTF(3, "%s highest metric %d\n",
>> 	      __func__, metric_counter)
>> #endif
>>  
>>   while(delete_kernel_list!=NULL)
>>     {
>>       struct destination_n *previous_node = delete_kernel_list;
>> 
>>!       if(metric_counter == 0) 
>>! 	{
>>! 	  struct destination_n *tmp_node = delete_kernel_list;
>>! 
>>! 	  printf("Stale route(s) detected in %s!\n", __func__);
>>! 
>>! 	  while(tmp_node)
>>! 	    {
>>! 	      printf("Stale route to to %s via %s by %s hopcount %d!\n",
>>! 		     olsr_ip_to_string(&destination_ptr->destination->rt_dst),
>>! 		     olsr_ip_to_string(&destination_ptr->destination->rt_router),
>>! 		     destination_ptr->destination->rt_if->int_name,
>>! 		     destination_ptr->destination->rt_metric);
>>! 	      tmp_node = tmp_node->next;
>>! 	    }
>>! 
>>! 	  assert(0);
>>! 	}
>> 
>>       /* searching for all the items with metric equal to n */
>>       for(destination_ptr = delete_kernel_list; destination_ptr != NULL; )
>> 	{
>> 
>>***************
>>*** 325,334 ****
>>--- 347,363 ----
>> 	     ((last_run || 
>> 	       !COMP_IP(&destination_ptr->destination->rt_dst, 
>>                         &destination_ptr->destination->rt_router))))
>> 	    {
>> 	      olsr_16_t error;
>>+ 
>>+ 	      printf("Deleting route to to %s via %s by %s hopcount %d!\n",
>>+ 		     olsr_ip_to_string(&destination_ptr->destination->rt_dst),
>>+ 		     olsr_ip_to_string(&destination_ptr->destination->rt_router),
>>+ 		     destination_ptr->destination->rt_if->int_name,
>>+ 		     destination_ptr->destination->rt_metric);
>>+ 
>> #ifdef DEBUG
>> 	      OLSR_PRINTF(3, "Deleting route to %s hopcount %d\n",
>> 			  olsr_ip_to_string(&destination_ptr->destination->rt_dst),
>> 			  destination_ptr->destination->rt_metric)
>> #endif
>> 
>>
>>------------------------------------------------------------------------

-- 
Andreas Tønnesen
http://www.olsr.org
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: debug_output.diff
URL: <http://lists.olsr.org/pipermail/olsr-users/attachments/20050519/fc9239f6/attachment.ksh>

