[olsr-dev] Re: olsrd 0.4.9 dotdraw plugin fd-leak
Thu Jun 2 15:11:06 CEST 2005
Thanks for reporting.
I guess this problem occurs because the MSG_NOSIGNAL flag is being
used in send(2). Would it be possible for you to replace MSG_NOSIGNAL with
0 (line 432, olsrd_dot_draw.c), recompile and test?
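
For reference, a minimal sketch of how a send(2) failure can be turned into an explicit close. This is not the plugin's code: send_or_close is a hypothetical helper, and it assumes Linux, where MSG_NOSIGNAL makes a write to a connection the peer has already closed fail with EPIPE instead of raising SIGPIPE. In that case nothing closes the descriptor unless the caller does it:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not taken from olsrd_dot_draw.c.
 * Sends len bytes on *fd. On any send error other than EINTR the
 * descriptor is closed and *fd is set to -1, so it cannot be leaked.
 * Returns 0 on success, -1 on failure (errno from send(2) is preserved). */
static int send_or_close(int *fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(*fd, p, len, MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted, retry the same chunk */
            int saved = errno;
            close(*fd);            /* peer is gone: release the fd now */
            *fd = -1;
            errno = saved;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

A caller would keep the connection descriptor in a variable, pass its address, and treat a value of -1 afterwards as "no client connected".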
> While using olsrd on the current main gateway of Opennet Rostock, we
> noticed that it frequently terminated itself after running for a couple
> of hours. Running the daemon with debug output, the last line was
> "(DOT DRAW)IPC accept: Too many open files". This output was triggered
> by (what is called in the original distribution)
> "olsrd-0.4.9/lib/dot_draw/src/olsrd_dot_draw.c", specifically the lines
>   if ((ipc_socket = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
>     olsr_printf(1, "(DOT DRAW)IPC socket %s\n", strerror(errno));
>     return 0;
>   }
> Using lsof -n -p `pidof olsrd` while olsrd was still running showed an
> increasing number of lines similar to:
> olsrd 5543 root 10u IPv4 6145616 TCP localhost:2004->localhost:38723 (CLOSE_WAIT)
> olsrd 5543 root 11u IPv4 6145623 TCP localhost:2004->localhost:38724 (CLOSE_WAIT)
> olsrd 5543 root 12u IPv4 6145629 TCP localhost:2004->localhost:38725 (CLOSE_WAIT)
> olsrd 5543 root 13u IPv4 6145635 TCP localhost:2004->localhost:38726 (CLOSE_WAIT)
> olsrd 5543 root 14u IPv4 6145639 TCP localhost:2004->localhost:38727 (CLOSE_WAIT)
> I.e. the dotdraw plugin didn't close fds corresponding to connections to
> clients that had already been closed.
> There are several ways to reproduce this; for an olsrd that doesn't have
> contact with any other olsr daemons, repeatedly [opening a connection to
> the port dotdraw listens on, waiting for the plugin's entire output
> to be written, and then closing it] will trigger this behaviour.
> For an olsrd that does have contact with other daemons, the bug can
> reliably be reproduced by opening a connection to the port, and then
> opening a second one while the first one is kept open (e.g. by using two
> netcat instances). While dotdraw isn't designed to handle several
> connections in parallel, IMO leaking fds and eventually killing the
> olsrd in that case is still unacceptable behaviour.
> I've attached a simple fix to this bug; the patched version will close
> the previous fd used for a client connection before accepting another one.
> Sebastian Hagen, Opennet Rostock