[Olsr-users] lxc network emulation

(spam-protected) (spam-protected)
Sat Oct 22 12:25:37 CEST 2011


Hello Markus,
I must admit that you have valid, precise and well-made arguments.
I do have overhead, since every interface of every machine has a (small)
process associated with it to emulate the network through a socket (linked
to the taps of the lxc machines).
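For the curious, that per-interface process can be sketched roughly like
this (illustrative Python only, not cloonix's actual code; the interface
and socket names are made up):

```python
# Rough sketch of a per-interface forwarder: read Ethernet frames from
# a TAP device and relay them over a Unix socket acting as the "wire".
# Illustrative only -- not cloonix's actual code; names are made up.
import fcntl
import os
import socket
import struct

TUNSETIFF = 0x400454CA  # ioctl to attach to a tun/tap interface
IFF_TAP = 0x0002        # Ethernet-level (tap) mode
IFF_NO_PI = 0x1000      # no extra packet-info header

def ifreq(name, flags):
    # struct ifreq for TUNSETIFF: 16-byte name, short flags, padding
    return struct.pack("16sH14s", name.encode(), flags, b"\x00" * 14)

def open_tap(name):
    # needs root and an existing tap interface of that name
    fd = os.open("/dev/net/tun", os.O_RDWR)
    fcntl.ioctl(fd, TUNSETIFF, ifreq(name, IFF_TAP | IFF_NO_PI))
    return fd

def relay(tap_fd, wire_path):
    # forward every frame from the tap into the emulated wire
    wire = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    wire.connect(wire_path)
    while True:
        frame = os.read(tap_fd, 2048)  # one Ethernet frame per read
        wire.send(frame)

# usage (as root): relay(open_tap("tap0"), "/tmp/wire.sock")
```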

The dhcp is only for the admin interface and is managed automatically;
cloonix reaches machines by name, the admin ip is distributed by cloonix,
and you can ssh to your machine with the name only.

I also use the cloonix graph processes to see an image of the machines, and
this too uses up some cpu, because the graph makes each interface of each
machine blink as packets go through it.

The real goal of cloonix is the automated configuration of many (ok, not so
many) machines, and to ease dynamic topology modification as well as the
management, access and visualisation of machines and networks.

You are right about the large performance difference between our setups; it
just shows that progress will come from two directions in the future: the
software can be optimized while machine power keeps growing.

Your experiment suggests the same thing as mine: real-time emulation will
probably replace discrete-event simulators in the coming years.



> On Sat, Oct 22, 2011 at 12:01 AM, <(spam-protected)> wrote:
>
>> In my machines there are also sshd and other things that are usually
>> there:
>>
> hmm i do not expect an ssh server to be resource-hungry
>
> but a serial console is enough, imho
> (or even nothing,.. (-;)
>
> i had around 2-3MB ram usage per container (debian minimal rootfs) for a
> simple system (just init, getty, rsyslog, olsrd)
> and 1.5MB for containers with just olsrd running,.. (lxc-init, olsrd)
>
> (those numbers are without additional olsrd/system RAM usage for
> topology/routes)
>
> so in both cases RAM (i had 16GB) was not my limit, (except for extremely
> dense topologies,..)
>
> (but having only 2 cpu cores, was the limit)
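(Side note from the editor: one way to check such per-container RAM figures
is to read the memory cgroup directly. A sketch, assuming the cgroup-v1
layout lxc commonly uses; "node1" is just an example container name:)

```shell
# Sketch: per-container RAM from the memory cgroup.
# Assumes the cgroup-v1 path lxc commonly uses;
# "node1" is just an example container name.
mb() { echo $(( $1 / 1024 / 1024 )); }   # bytes -> whole MB

for c in node1; do
    f="/sys/fs/cgroup/memory/lxc/$c/memory.usage_in_bytes"
    if [ -r "$f" ]; then
        echo "$c: $(mb "$(cat "$f")") MB"
    fi
done
```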
>
>>
>> [Clown1.1.1> ps -ef
>> UID        PID  PPID  C STIME TTY          TIME CMD
>> root         1     0  0 21:47 ?        00:00:00 init [2]
>> root       253     1  0 21:47 ?        00:00:00 /usr/sbin/rsyslogd -c4
>> root       264     1  0 21:47 ?        00:00:00 /usr/sbin/cron
>
> running cron is asking for trouble
> (as it might start something simultaneously on all systems)
>
>>
>> root       324     1  0 21:47 ?        00:00:00 dhclient -v -pf
>> /var/run/dhclien
>>
>
> hmm who needs a dhcp client when running ospf/olsrd?
>
> except this, and that you are logged in (-;
>
> your system looks nice && small as well.
>
>> root       361     1  0 21:47 ?        00:00:00 /usr/sbin/sshd
>> root       373     1  0 21:48 ?        00:00:00 /usr/sbin/olsrd -d 0
>> root       376     1  0 21:48 console  00:00:00 /bin/login -f
>> console
>> root       377   376  0 21:48 console  00:00:00 -bash
>> root       380   361  0 21:48 ?        00:00:00 sshd: (spam-protected)/0
>> root       382   380  0 21:48 pts/0    00:00:00 -bash
>> root       387   382  0 21:48 pts/0    00:00:00 ps -ef
>> [Clown1.1.1>
>>
>> Maybe this explains the difference, and also maybe more machines can be
>> handled with olsr. I had a big problem with ospf because when the
>> topology is big, ospf eats up too much memory and cpu per machine, so I
>> did not push the olsr protocol much further.
>>
> ok (-;
>
> btw. i found that olsrd's internal scheduler (of the stable branch) uses
> too much cpu as well, especially in such setups.
> so in fact on my tegra2 setup, 75% of the cpu load was introduced by the
> schedulers.
> (calculating the routes, parsing/generating messages, and so on, used
> only 20%)
>
> btw my above numbers (250/2000 instances) are with olsrd 0.6.2 from the
> stable branch with the default config.
> by raising the PollRate, even more olsrd instances are possible on the
> same cpu.
> furthermore the next major releases of olsrd will have better scheduling,
> too.
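(For readers wondering where that knob lives: PollRate is a one-line
setting in olsrd.conf; the value below is only illustrative, not a
recommendation from the thread.)

```
# olsrd.conf fragment (value illustrative): a coarser scheduler poll
# interval means fewer wakeups and less idle cpu per olsrd instance
PollRate 0.2
```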
>
>>
>> The goal of cloonix is to manage the networking and the
>> creation/destruction of the machines, not really to set a record number
>> of machines.
>>
> sure.
>
> my initial idea was to evaluate whether one can run/simulate a reasonably
> complex mesh-network on a "smartphone",
> e.g. 250 olsrd instances, 350 interfaces, ~2000 links,
> on a tegra2 system (cortex a9, 2x 1ghz, 512MB RAM)
>
> so yes, it was a bit like record hunting (on limited hardware,..)
>
> and afterwards i just checked how it would scale on a real PC.
>
> and as your number of instances was (normalized by cpu performance) only
> around 1/15th of my setup,
> (which might indicate severe overhead somewhere in your setup)
>
> i just had to say something (-;
>
>> per machine 2000 machines = 10 giga!
>>
> btw i ran ~ 2000 on a dual core (5Ghz) system,
>
> an empty machine itself should need next to nothing
> (if it's a linux container and not really a (virtual) machine)
>
>> And also my file-systems are duplicated except for /bin /sbin /lib and
>> /usr, and this takes some hard disk, 2000 would be far too much.
>
> (depends on the setup,..)
>
> btw i ran my system on (nearly) completely read-only filesystems (-;
> (i just had ~100KB hard-disk usage per instance (-;)
>
> or i just started an olsrd process inside a container (with its own
> network stack),
>
> which imho for many testcases is just enough, as olsrd has its plugins
> that allow querying its state via the network,..
> and booting hundreds/thousands of systems (even if they aren't
> RAM-hungry) just takes too much time,.. (-;
>
> Markus
>
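(Editor's note: to make that last point concrete, here is a hedged sketch
of giving a single olsrd process its own network stack via a network
namespace, with no container boot at all. All names are illustrative; RUN
defaults to echo so this is a dry run, and executing it for real requires
root and an empty RUN.)

```shell
# Sketch (names illustrative): one olsrd per network namespace,
# no full container boot. RUN defaults to "echo" (dry run);
# invoke with RUN= (empty, as root) to actually execute.
RUN="${RUN-echo}"

mknode() {
    name=$1
    $RUN ip netns add "$name"
    # veth pair: one end stays in the host, the peer goes into the namespace
    $RUN ip link add "veth-$name" type veth peer name eth0 netns "$name"
    $RUN ip netns exec "$name" ip link set eth0 up
    # start olsrd on that interface; its state can then be queried over
    # the network (e.g. via the txtinfo plugin, if it is loaded)
    $RUN ip netns exec "$name" olsrd -i eth0 -d 0
}

mknode n1
```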




More information about the Olsr-users mailing list