[olsr-dev] Re: olsrd 0.4.9 dotdraw plugin fd-leak

Andreas Tønnesen (spam-protected)
Thu Jun 2 15:11:06 CEST 2005


Hi Sebastian,

Thanks for reporting.
I guess this problem is caused by the MSG_NOSIGNAL flag passed to
send(2). Could you replace MSG_NOSIGNAL with 0 (line 432,
olsrd_dot_draw.c), recompile, and test?

- Andreas
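
For anyone repeating the test suggested above: with a flags argument of 0,
send(2) to a peer that has gone away can raise SIGPIPE, so the signal has to
be ignored or handled, and a failed send is a natural place to close the
client fd. The following is a minimal sketch under those assumptions, not the
code from olsrd_dot_draw.c; the names ignore_sigpipe and send_or_close are
hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: without MSG_NOSIGNAL, a send() to a closed peer
 * may deliver SIGPIPE, whose default action terminates the process.
 * Ignoring it makes send() fail with EPIPE instead. */
static void ignore_sigpipe(void)
{
    signal(SIGPIPE, SIG_IGN);
}

/* Hypothetical helper: send the whole buffer with flags set to 0.
 * On any error other than EINTR, close the descriptor and set *fd to -1
 * so that it cannot be leaked. Returns 0 on success, -1 on failure. */
static int send_or_close(int *fd, const char *buf, size_t len)
{
    size_t off = 0;

    while (off < len) {
        ssize_t n = send(*fd, buf + off, len - off, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry */
            close(*fd);         /* peer gone or other error: release fd */
            *fd = -1;
            return -1;
        }
        off += (size_t)n;
    }
    return 0;
}
```

Calling ignore_sigpipe() once at startup and routing all writes to the client
through send_or_close() would be one way to check whether the fds still pile
up in CLOSE_WAIT after the change.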


> Hello,
>
> While using olsrd on the current main gateway of Opennet Rostock, we
> noticed that it frequently terminated itself after running for a couple
> of hours. Running the daemon with debug output, the last line was
> "(DOT DRAW)IPC accept: Too many open files". This output was triggered
> by (what is called in the original distribution)
> "olsrd-0.4.9/lib/dot_draw/src/olsrd_dot_draw.c", specifically the lines
>
> if ((ipc_socket = socket(AF_INET, SOCK_STREAM, 0)) == -1)
>   {
>     olsr_printf(1, "(DOT DRAW)IPC socket %s\n", strerror(errno));
>     return 0;
>   }
>
> Using lsof -n -p `pidof olsrd` while olsrd was still running showed an
> increasing number of lines similar to:
>
> olsrd   5543 root   10u  IPv4    6145616   TCP localhost:2004->localhost:38723 (CLOSE_WAIT)
> olsrd   5543 root   11u  IPv4    6145623   TCP localhost:2004->localhost:38724 (CLOSE_WAIT)
> olsrd   5543 root   12u  IPv4    6145629   TCP localhost:2004->localhost:38725 (CLOSE_WAIT)
> olsrd   5543 root   13u  IPv4    6145635   TCP localhost:2004->localhost:38726 (CLOSE_WAIT)
> olsrd   5543 root   14u  IPv4    6145639   TCP localhost:2004->localhost:38727 (CLOSE_WAIT)
>
> I.e. the dotdraw plugin didn't close the fds of connections that the
> clients had already closed.
> There are several ways to reproduce this; for an olsrd that has no
> contact with any other olsr daemons, repeatedly [opening a connection to
> the port dotdraw listens on, waiting for the plugin's entire output to
> be written, and then closing it] will trigger this behaviour.
>
> For an olsrd that does have contact with other daemons, the bug can
> reliably be reproduced by opening a connection to the port, and then
> opening a second one while the first one is kept open (e.g. by using two
> netcat instances). While dotdraw isn't designed to handle several
> connections in parallel, IMO leaking fds and eventually killing the
> olsrd in that case is still unacceptable behaviour.
> I've attached a simple fix to this bug; the patched version will close
> the previous fd used for a client connection before accepting another one.
>
> Sebastian Hagen, Opennet Rostock
>


---------
Andreas Tønnesen
http://www.olsr.org


