[Olsr-dev] Compressing OLSRd messages
Amadeus
(spam-protected)
Sun Apr 17 20:59:58 CEST 2011
Hi,
I'm forwarding this message to you guys from our local Freifunk
community in Chemnitz. This abstract is about making plans for
compressing OLSRd payload using address compression and other tricks in
order to be more efficient even on poor links like GPRS connections.
cat << EOF:
Hello,
In Chemnitz (Saxony, Germany) we are currently building up a local
Freifunk network. Unfortutnally since there's no local internet uplink
available at the moment we often have to resort to some UMTS mobile
uplink which is rate-limited and slow once we get over our limit. Under
normal conditions the UMTS uplink is more than enough to carry the
routing traffic and the actual data but whenever - because of the real
customer-friendly contract terms of mobile providers - the connection
gets rate-limited to EDGE or GPRS the connection becomes unstable and
sometimes breaks down by the load of the routing traffic alone.
Because of these issues we sat down and did some traffic analysis for
our network in Chemnitz using tcpdump (and some custom scripts I'll
explain a bit later). The first step was getting the routing traffic
from one of our (central) nodes:
# tcpdump -ni tun0 -s 65535 -w olsr.pcap
For a total of 8 active nodes (not including clients connected using
ARP-Roaming) we got about 50 MB of traffic only at that particular node.
Given that we rarely use the network for continious duty, but rather for
spontanious internet connections this appears to be quite a lot of
traffic just to keep the network "alive".
For the whole network we get a background noise of about 500 to 1000
bytes per second at a rate of approx. 2-5 packets/second. Doesn't sound
that much, but given that we have to maintain a VPN (tinc) between our
local nodes in Chemnitz and the gateway infrastructure at the data
center we get a multiplier of up to 8 on this traffic rate - and
therefore filling up the whole GPRS link at once only with routing
traffic by OLSRd and tinc. That way the link gets unuseable with latency
of 15 seconds(!) and more (yeah, we did the tests ;-)). A further look
into the traffic rate distribution also revealed that large or
spontanious changes (e.g. caused by congested or failing links) may also
cause traffic spikes up to four times above the background noise
worsening the situation even more. This factor seems to follow linear to
the network size, although we aren't quite conclusive about this yet. An
analysis of the traffic itself further revealed that about 60-80 percent
of sent packets have a size of 80 to 320 bytes on-the-wire. 15% are 320
to 640 bytes. Even bigger packets are rare but exist. The biggest
network in a dump we made was 1102 bytes of payload.
But now for our suggestions:
Having a look at RFC3626 shows quite some room for improvements. The
first of those is the lack of support for running OLSR in a Dual-Stack
manner with IPv4 and IPv6 simultaniously. Instead of reusing the message
channel of one daemon you have to open up a second channel by running a
second instance of the daemon. With some protocol changes (read on for
details) you can transport both IPv4 and IPv6 at once and even save on
bandwidth. Instead if using RFC3626 you only can read the message
contents if you know the underlaying OSI 3 protocol. This makes OLSR
ambiguous when parsing and thus error-prone [1]. A solution here might
be putting the address family or the size of an address into the packet
header, but read on.
Much more critical than this ambiguity of the address format used is the
number of places where such addresses are used. Not only does it use a
lot of IP addresses but in most cases it repeats those addresses alot or
uses addresses mainly from the same or simular subnets thus creating a
lot of redundancy in the package data. When the people from Augsburg
sent me a dump of their routing traffic to compare my results to, I
asked them to bzip the resulting file which made the dump smaller by
about factor 10. A usual approach by most compression programs to
eliminate such redundancy is building a table of "repeating strings" and
only store references into the table that describes all the occurances.
For OLSR this means building up an index of IP addresses used inside the
routing packet (not on message level, reasons below) and instead of
writing the IP address just store a 1 byte (or two bte) value
referencing it. as most addresses inside a package also share the same
prefix building a tree-like structure is some promising option.
Given some network dumps (for Chemnitz and Augsburg) I did some
"counting" with some scripts I wrote (in PHP, but they do their job
quite well) that give an overview on the bandwidth used by OLSR. This
overview confirms mostly what could be suspected by only having a look
at the traffic dumps with tcpdump and wireshark:
- Most traffic is due to TC or TC-LQ messages
- In about 25% of the cases OLSR produces packets with only a single
message.
More on those two issues below when I present the output from the
statistics. But as you can probably guess: Not repeating the IPs over
and over might save a lot of space; especially in the case of IPv6 where
addresses are significantly larger than a one or two byte index into a
table. Using variable length integers you could even avoid limitsting
yourself by the size of those table indices. But that's more a question
for the actual implementation.
But even with this quite simple scheme of compressing the used addresses
there are some issues that need to be addressed:
1. If you only create a list of IPs used in the packet you still have
the overhead created due to simular prefixes used
2. You have to update packets for forwarding so the IP addresses used
match the list indices when you received those packages
The first problem can be solved quite easily (as mentioned above) by
using a tree-like structure where you split each address in different
parts (e.g. one byte each) and report for each level how many different
of such parts exist in the next level. Each leaf of this tree specifies
one IP address.
For example: Let's assume we have to encode the following IPs for our
package:
10.8.40.1, 10.8.40.2, 10.8.40.3, 10.8.40.4, 10.8.40.5, 10.8.40.7,
10.8.40.20, 10.8.40.23, 10.8.6.5, 10.8.40.6, 10.8.128.80, 10.8.129.97
As you can easily see there's a lot of packets strting with the prefix
10.8. But to simplify our example we assume our splitting is done on
byte level. That way we get:
10 (8 (6 (5, 6), 40 (1, 2, 3, 4, 5, 7, 20, 23), 128 (80), 129 (97) ) )
In order to be able to properly parse those 18 nodes of the tree when
trying to read the packet we have to add the number of direct children
for each node. In our case we simply get:
1 10 1 8 4 6 2 5 6 40 8 1 2 3 4 5 7 20 23 128 1 80 129 1 97
Since there's only up to 256 possible values for the number of direct
childs we can store the cound of childs with one byte each. This results
in only about 25 bytes instead of 48 bytes for the 12 IPv4 addresses we
stored in this tree. In case of IPv6 the savings should be even bigger
given the much larger addresses there. Typically the coding of the tree
is smaller than the uncompressed list starting from the third entry you
add. But even if the tree is slightly larger than the uncompressed list:
Since you usually repeat addresses in the OLSR packet you get to save on
bandwidth there.
As mentioned above there are some statistics on common traffic patterns
for OLSR. Even if one might assume that the worst case behaviour of this
compression is 2.5 the original IP address (not including additional
packet and message headers) the typical situation looks a bit different.
But even if there was such a case you still could fall back
(transparently) to uncompressed coding for single packets thus avoiding
this disadvantage.
A major advantage of this way of compressing traffic is, despite its
slightly higher needs to perform the packet encoding, that it can be
applied to all parts of a packet where addresses of some kind appear. If
you - additionally - also allow for multiple such compressed address
tables per packet you even could intermix IPv4 and IPv6 in the same
packet. The compression also still works well in cases where you
announce foreign prefixes via HNA. According to [2] there has been
discussion about storing HNA messages in CIDR notation in cases of IPv6.
Given that its a waste of space it's only natural to store CIDR for both
IPv4 and IPv6 thus saving (up to) 6 bytes for IPv4 and (up to) 30 bytes
for IPv6 (per HNA entry).
An example for this coding would be the case if I was to announce the
HNA for network 10.8.128.80/255.255.255.240 for my note. Currently this
would create a HNA entry of 8 bytes or 5 bytes if I only stored
10.8.128.80/28. With address compression I can take this even a step
further and compress it to 11/28 (2 bytes) referencing the above tree
structure). The 11 in this example means "the eleventh entry in the IP
table structure of this message".
In order to support IP lists of more than 128 entries I suggest the use
of VLQ [3] or Variable Length Coded integers as e.g. used in MKV or
MIDI. The advantage here lays in the way they represent small numbers.
As long as you have a small table you only need to store one byte for
the index, but you still can reference larger numbers by using
additional bytes indicating a continuation. The basic format here is to
store the data portion in the lower 7 bits of a byte and use the MSb to
store wheher there is more data to come or not. Naturally the
best-performing implementation here is to use big-endian for storage,
i.e. the most-significant data comes first.
An example on this: Let's store the number 286 (binary 1 0011 0110)
would be stored as "_1_000 0010 _0_001 0110" where _?_ means
"continuation". In order to limit the required resources to parse such
numbers implementations SHOULD be able to handle numbers of up to 4
bytes. An implementation MUST ALWAYS use the shortest possible encoding
of a number, but SHOULD be able to understand non-normalized versions of
the same number. I.e. 0x80 80 01 is valid input to the decoder but MUST
NEVER be produced by an encoder.
In order to integrate this address compression into OLSR I suggest the
following additions:
1. a new message type COMPRESS_IPLIST with the following structure:
uint8_t address_family; //AF_INET, AF_INET6, ...
uint8_t address_size; //Number of Bits in the address
uint8_t[] addr_tree; //a tree encoded as shown above
No padding is performed except for the usual padding between OLSR
messages
2. Each COMPRESS_IPLIST message defines a separatly useable list of
addresses can can be used inside of compressed messages
3. a new message type COMPRESS_HELLO
Simular to the normal HELLO message except without internal padding and
all addresses replaced by references to addresses in the message's
COMPRESS_IPLIST
4. a new message type COMPRESS_TC
Simular to the normal TC message except without internal padding and all
addresses replaced by references to addresses in the message's
COMPRESS_IPLIST
5. a new message type COMPRESS_MID
Simular to the normal MID message except without internal padding and
all addresses replaced by references to addresses in the message's
COMPRESS_IPLIST
6. No padding is applied at the end of messages if no further messages
follow
7. There MUST be at least on COMPRESS_IPLIST message present in front of
any other COMPRESS_* typed message.
8. Adding additional COMPRESS_IPLIST messages after COMPRESS_* type
messages is allowed and SHOULD be supported by implementations, but
additional COMPRESS_IPLIST messages MUST NOT be referenced before they
are placed in the packet (no forward referencing).
9. COMPRESS_* messages MUST contain a field specifying which
COMPRESS_IPLIST message they refer to.
10. An implementation SHOULD generate a minimum number of IPLISTs
necessary to contain all referenced IPs. It MAY however choose to split
IPlists into parts if this improves on achievable compression. This also
includes that multiple IPLISTs MAY include duplicate entries for an
address if this is necessary; although this situation SHOULD be avoided
if possible.
11. There SHOULD be a version and a feature field in packet headers in
order to distinguish different versions of the protocol and features
supported by the used OLSRd software. The feature field SHOULD
distinguish between used and supported features allowing for a
transparent fallback when some clients don't support a certain feature.
12. Compatibility with existing clients IS NOT subject of this proposal
(as unfortunate as this sounds). In order to achieve compatibility with
older clients you'd have to duplicate information which is in contrast
to compressing information.
Comments on this draft work are highly appreciated as this is still work
in progress. Especially the version management part is still to be done.
But some more details on that issue in another mail as this might get a
bit to far for the sole topic of compressing the OLSR contents. I'll
present some more thoughts on this later though.
But here (as promised above) some more details on two (real-life) OLSR
networks I did some analysis in:
Freifunk Chemnitz (our local network):
The dump is roughtly one week of data captured
---
Packets: 491086 Messages: 1595055 Min: 1 Max: 24 Avg: 3,25
Generic Distribution of Messages (Messages per Packet)
0: 0 1:186191 2:164317 3:27816 4: 4565 5:
5108 6: 7241 7:11504 8:18372 9:22205 10:19341 11:12650 12:
6715 13: 3080 14: 1237 15: 458 16: 168 17: 64 18: 24 19:
11 20: 7 21: 6 22: 4 23: 0 24: 2
Distribution of Messages by Type (Messages per Packet)
01: 0:491086 1: 0 2: 0 3: 0 4: 0 5: 0
6: 0 7: 0 8: 0 9: 0 10: 0 11: 0 12: 0
13: 0 14: 0 15: 0 16: 0 17: 0 18: 0 19: 0
20: 0 21: 0 22: 0 23: 0 24: 0
02: 0:491086 1: 0 2: 0 3: 0 4: 0 5: 0
6: 0 7: 0 8: 0 9: 0 10: 0 11: 0 12: 0
13: 0 14: 0 15: 0 16: 0 17: 0 18: 0 19: 0
20: 0 21: 0 22: 0 23: 0 24: 0
03: 0:462903 1:27878 2: 305 3: 0 4: 0 5: 0
6: 0 7: 0 8: 0 9: 0 10: 0 11: 0 12: 0
13: 0 14: 0 15: 0 16: 0 17: 0 18: 0 19: 0
20: 0 21: 0 22: 0 23: 0 24: 0
04: 0:351031 1:94805 2:30236 3:11773 4: 2800 5: 394
6: 42 7: 3 8: 2 9: 0 10: 0 11: 0 12: 0
13: 0 14: 0 15: 0 16: 0 17: 0 18: 0 19: 0
20: 0 21: 0 22: 0 23: 0 24: 0
c9: 0:270771 1:220314 2: 1 3: 0 4: 0 5:
0 6: 0 7: 0 8: 0 9: 0 10: 0 11: 0 12: 0
13: 0 14: 0 15: 0 16: 0 17: 0 18: 0 19: 0
20: 0 21: 0 22: 0 23: 0 24: 0
ca: 0: 268 1:374323 2: 5246 3: 5811 4: 5969 5: 8364
6:16483 7:38488 8:24081 9: 8627 10: 2568 11: 588 12: 145
13: 66 14: 23 15: 15 16: 11 17: 4 18: 4 19: 2
20: 0 21: 0 22: 0 23: 0 24: 0
Distribution of Size of Messages by Type:
01: Messages: 0 SumSize: 0 PktSize:
0 Min: 65536 Max: 0 Avg: 0,00
02: Messages: 0 SumSize: 0 PktSize:
0 Min: 65536 Max: 0 Avg: 0,00
03: Messages: 28488 SumSize: 455808 PktSize:
10140584 Min: 16 Max: 16 Avg: 16,00
04: Messages: 204055 SumSize: 8486820 PktSize:
51898560 Min: 28 Max: 84 Avg: 41,59
c9: Messages: 220316 SumSize: 15225328 PktSize:
56559276 Min: 16 Max: 92 Avg: 69,11
ca: Messages: 1142196 SumSize: 66372880 PktSize:
92486936 Min: 16 Max: 80 Avg: 58,11
Number of IP addresses per Packet/Message:
Packets: Min: 1 Max: 143 Avg: 20,38
U Packets: Min: 1 Max: 28 Avg: 8,62
Messages: Min: 1 Max: 24 Avg: 6,27
U Messages: Min: 1 Max: 17 Avg: 6,14
---
This dump suggests best case close to 7(minimum address table
size)+28(number of unique IPs)+24(Max packets)+143(IPs in one packet)
bytes of IPs (compressed) vs. 143*4 bytes uncompressed which gives a
ratio of about 1/2.8 for compression. Average case is about 7+8+4+21 vs.
21*4 (1/2.1). Thus even if the headers have to be added (and some
padding) you're most times above 50% compression.
Freifunk Augsburg (thanks to soma on IRC):
The dump is roughtly 1.5 weeks of data captured:
---
Packets: 2155874 Messages: 10594596 Min: 1 Max: 41 Avg:
4,91
Generic Distribution of Messages (Messages per Packet)
0: 0 1:587730 2:364316 3:216380 4:187529
5:167049 6:143465 7:119061 8:92465 9:66881 10:44863 11:28824
12:17422 13: 9835 14: 5697 15: 3583 16: 2177 17: 1283 18: 1021
19: 1125 20: 1075 21: 1176 22: 1306 23: 1606 24: 2188 25: 3768
26: 7006 27:12489 28:18223 29:19710 30:14930 31: 7765 32: 2934
33: 779 34: 155 35: 28 36: 9 37: 6 38: 9 39: 5
40: 0 41: 1
Distribution of Messages by Type (Messages per Packet)
01: 0:2155874 1: 0 2: 0 3: 0 4: 0 5:
0 6: 0 7: 0 8: 0 9: 0 10: 0 11: 0 12: 0
13: 0 14: 0 15: 0 16: 0 17: 0 18: 0 19: 0
20: 0 21: 0 22: 0 23: 0 24: 0 25: 0 26: 0
27: 0 28: 0 29: 0 30: 0 31: 0 32: 0 33: 0
34: 0 35: 0 36: 0 37: 0 38: 0 39: 0 40: 0
02: 0:2155874 1: 0 2: 0 3: 0 4: 0 5:
0 6: 0 7: 0 8: 0 9: 0 10: 0 11: 0 12: 0
13: 0 14: 0 15: 0 16: 0 17: 0 18: 0 19: 0
20: 0 21: 0 22: 0 23: 0 24: 0 25: 0 26: 0
27: 0 28: 0 29: 0 30: 0 31: 0 32: 0 33: 0
34: 0 35: 0 36: 0 37: 0 38: 0 39: 0 40: 0
03: 0:1457555 1:459799 2:138561 3:49218 4:27138
5:14883 6: 6258 7: 1929 8: 452 9: 72 10: 5 11: 2
12: 1 13: 1 14: 0 15: 0 16: 0 17: 0 18: 0
19: 0 20: 0 21: 0 22: 0 23: 0 24: 0 25: 0
26: 0 27: 0 28: 0 29: 0 30: 0 31: 0 32: 0
33: 0 34: 0 35: 0 36: 0 37: 0 38: 0 39: 0
40: 0
04: 0:1602656 1:433719 2:94192 3:22346 4: 2789 5:
146 6: 19 7: 2 8: 4 9: 1 10: 0 11: 0 12:
0 13: 0 14: 0 15: 0 16: 0 17: 0 18: 0 19: 0
20: 0 21: 0 22: 0 23: 0 24: 0 25: 0 26: 0
27: 0 28: 0 29: 0 30: 0 31: 0 32: 0 33: 0
34: 0 35: 0 36: 0 37: 0 38: 0 39: 0 40: 0
82: 0:1955548 1:180805 2:17452 3: 1899 4: 156 5:
13 6: 1 7: 0 8: 0 9: 0 10: 0 11: 0 12:
0 13: 0 14: 0 15: 0 16: 0 17: 0 18: 0 19: 0
20: 0 21: 0 22: 0 23: 0 24: 0 25: 0 26: 0
27: 0 28: 0 29: 0 30: 0 31: 0 32: 0 33: 0
34: 0 35: 0 36: 0 37: 0 38: 0 39: 0 40: 0
c9: 0:1577253 1:578618 2: 3 3: 0 4: 0 5:
0 6: 0 7: 0 8: 0 9: 0 10: 0 11: 0 12: 0
13: 0 14: 0 15: 0 16: 0 17: 0 18: 0 19: 0
20: 0 21: 0 22: 0 23: 0 24: 0 25: 0 26: 0
27: 0 28: 0 29: 0 30: 0 31: 0 32: 0 33: 0
34: 0 35: 0 36: 0 37: 0 38: 0 39: 0 40: 0
ca: 0:49151 1:789127 2:295667 3:256398 4:215405
5:170807 6:123312 7:77825 8:42804 9:21016 10: 8941 11: 4345
12: 2242 13: 1341 14: 1263 15: 1411 16: 1502 17: 1789 18: 2735
19: 5150 20: 9418 21:15848 22:20824 23:19928 24:12572 25: 4541
26: 474 27: 20 28: 13 29: 1 30: 1 31: 2 32: 1
33: 0 34: 0 35: 0 36: 0 37: 0 38: 0 39: 0
40: 0
Distribution of Size of Messages by Type:
01: Messages: 0 SumSize: 0 PktSize:
0 Min: 65536 Max: 0 Avg: 0,00
02: Messages: 0 SumSize: 0 PktSize:
0 Min: 65536 Max: 0 Avg: 0,00
03: Messages: 1122954 SumSize: 19707008 PktSize:
348850444 Min: 16 Max: 24 Avg: 17,55
04: Messages: 701196 SumSize: 21061744 PktSize:
304651920 Min: 20 Max: 116 Avg: 30,04
82: Messages: 222101 SumSize: 54828760 PktSize:
153099912 Min: 52 Max: 908 Avg: 246,86
c9: Messages: 578624 SumSize: 50502516 PktSize:
224805188 Min: 44 Max: 124 Avg: 87,28
ca: Messages: 7969721 SumSize: 444490760 PktSize:
595315788 Min: 16 Max: 88 Avg: 55,77
Number of IP addresses per Packet/Message:
Packets: Min: 1 Max: 190 Avg: 30,69
U Packets: Min: 1 Max: 50 Avg: 12,40
Messages: Min: 1 Max: 48 Avg: 6,25
U Messages: Min: 1 Max: 24 Avg: 5,79
---
The Freifunk Augsburg dump mostly confirms these results and takes it to
the extreme even more. Calculations for the table size are simular and
yield results of 1/2 to 1.3.5 for compression.
Maybe a short note on the SumSize vs. PkgSize values: The SumSize is the
sum of all messages of this type. The PkgSize is the sum of all packets
containing at least one message of that type. The ratio of
SumSize/PkgSize can be used to estimate how much compression of a
certain type of messages is worth the effort.
Regards,
BenBE.
[1]
https://sourceforge.net/tracker/?func=detail&atid=469573&aid=3287379&group_id=53066
[2] http://www.olsr.org/docs/report.pdf
[3] http://en.wikipedia.org/wiki/Variable-length_quantity
EOF
Thanks for reading. What do you think? We'd love to implement this on
version 0.7 if any possible.
Best regards,
amadeus
More information about the Olsr-dev
mailing list