OLSR-NG
Goals
Our mission is simple: build the most scalable and usable routing daemon for networks that mix wireless and fixed-line segments. The routing daemon shall scale up to
- 10000 (10K) nodes and
- 20000 (20K) routes
running on low-cost hardware (200 MHz RISC CPUs / 32 MB of memory).
One of the main goals is to make OLSR more scalable in practice.
[Figure: Complexity for n=1000 nodes of different data structures in the Dijkstra shortest path (SPF) algorithm.]
In this picture you can see the complexity graphs for the SPF computation under the assumption that every node has 10 edges. The red line has O(n^2) complexity, which corresponds to the current implementation of OLSR from www.olsr.org. OLSR-NG plans to reduce the complexity to the green or even the yellow level. This will allow mesh network clouds to become larger by a factor of roughly 1000 (on the routing layer / layer 3).
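To get a feeling for the gap between the curves, the step counts can be estimated with a few lines of C. This is only a rough sketch: it assumes the O(n^2) curve is Dijkstra with a linear scan for the closest node and that the improved variant uses a heap or balanced tree as priority queue. The function names are made up for this example and do not exist in olsrd.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest k with 2^k >= n, used as an integer stand-in for log2(n). */
unsigned ceil_log2(uint64_t n)
{
    unsigned k = 0;
    uint64_t p = 1;

    assert(n >= 1);
    while (p < n) {
        p <<= 1;
        k++;
    }
    return k;
}

/* Dijkstra with a linear scan for the closest node: about n^2 steps. */
uint64_t spf_steps_linear_scan(uint64_t n)
{
    return n * n;
}

/* Dijkstra with a heap or balanced tree as priority queue:
 * about (n + m) * log2(n) steps, where m = n * edges_per_node. */
uint64_t spf_steps_heap(uint64_t n, uint64_t edges_per_node)
{
    uint64_t m = n * edges_per_node;

    return (n + m) * ceil_log2(n);
}
```

For n = 1000 nodes with 10 edges each, this gives 1,000,000 steps for the linear scan and 110,000 steps for the heap variant. Constant factors are ignored, so the numbers only indicate the order of magnitude.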
To achieve that we first want to
- fix the existing olsrd and add new data structures and algorithms.
- Once olsrd is running fast, we will focus on protocol issues such as
- measuring better link metrics, for example by including the bandwidth (ETT)
- link-state db synchronization issues (rather than brute force retransmission).
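As an illustration of what a bandwidth-aware metric looks like, here is a minimal sketch of ETX and ETT using their commonly cited definitions: ETX = 1 / (forward delivery ratio x reverse delivery ratio), and ETT = ETX x packet size / bandwidth. The function names and parameters are assumptions made for this example. This is not code from olsrd, and the metric that OLSR-NG ends up with may differ.

```c
#include <assert.h>

/* ETX: expected number of transmissions for one successful delivery,
 * given forward and reverse delivery ratios in (0, 1]. */
double etx(double forward_ratio, double reverse_ratio)
{
    assert(forward_ratio > 0.0 && forward_ratio <= 1.0);
    assert(reverse_ratio > 0.0 && reverse_ratio <= 1.0);
    return 1.0 / (forward_ratio * reverse_ratio);
}

/* ETT: expected time in seconds to deliver one packet over the link,
 * i.e. ETX scaled by packet size (bits) over link bandwidth (bits/s). */
double ett(double etx_value, double packet_bits, double bandwidth_bps)
{
    assert(bandwidth_bps > 0.0);
    return etx_value * (packet_bits / bandwidth_bps);
}
```

With ETX alone, a lossless 1 Mbit/s link and a lossless 54 Mbit/s link look identical. ETT tells them apart because the faster link needs less air time per packet.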
All protocol extensions shall be documented as an internet-draft and submitted to the IETF MANET working group http://www.ietf.org/html.charters/manet-charter.html
Next we want to improve the management tools of olsrd like the
- http_info plugin or
- txt_info plugins or
- building a new XMLinfo plugin
such that large clouds consisting of thousands of nodes can be troubleshot effectively.
OLSR-NG is an open source project, meaning everybody is invited to join in and help. We do have some bounties for the best solutions. If you want to participate, drop us an email: mailto:aaron@lo-res.org, mailto:hannes@gredler.at or mailto:bernd@firmix.at
Current Status
- olsrd 0.5.4 was released! Thx everybody a lot! Big credits go out to Hannes and Bernd.
- A UML test server is being worked on. This will allow the B.A.T.M.A.N. team to test their protocol and us to test our scalability ideas with thousands of olsr instances.
- Ongoing code cleanups
sponsor
[Image: supported by IPA]
Made possible by a grant from IPA. Thanks, we really appreciate your help and your courage to support us!
Subprojects
SPF refactoring
LSDB refactoring
RIB refactoring
main links
Main OLSR-NG project blog: http://olsr.funkfeuer.at
Slides from the OLSR-NG kickoff presentation: http://outpost.funkfeuer.at/~aaron/olsr-ng.pdf
We communicate on the olsr-dev mailing list: https://www.olsr.org/mailman/listinfo/olsr-dev . All commit messages can be seen on the olsr-cvs list.
UML test server
current load and statistics: http://texas.funkfeuer.at
[Image: topo map of 1500 UML instances running in parallel. Note the packet loss! (Check out the TopologyPics archive also.)]
We have already been running 2000 instances and there was still plenty of RAM left, so 1000 is a very safe bet. According to the UML documentation, we can probably scale much higher, because UML only takes the RAM that each instance actually needs. UML has other shortcomings, though: high CPU overhead and lots of context switches. We are trying to increase the performance at the moment...
current open todos UML server
Next important (*) things to do:
- DONE(aka) update texas's BIOS - FIXED
- add the packet loss tc rules (zethix already prepared it)
- create random networks (easy)
- create network topologies based on a power law distribution (a bit harder, but realistic for the internet)
- DONE(zethix) create scripts to find out which olsrd instances crashed
- create scripts to find out if a UML instance is not responsive anymore
- find better measurement tools. Look into sar
- DONE(aka) recompile host kernel and get rid of the "BUG: soft lockup detected on CPU#0!" messages
- DONE(aka) recompile host kernel and enable the preemption patch
- DONE(zethix,aka) make hostfs so that developers can easily upload a new olsrd version to all uml instances. They should see the difference easily. Look into hostfs
- DONE(ake) increase performance of the UML simulator itself (decrease HZ, look into SKAS3 patch again, 32 bit recompile, talk with jeff etc)
- find more meaningful topology visualization tools (http://www.caida.org)
- add b.a.t.m.a.n to the root filesystem. (?)
- compare the scheduling / scalability of the test with OpenVZ and olsr_switch
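For the power law topologies in the todo list, one possible generator is preferential attachment: every new node links to an existing node picked with probability proportional to that node's current degree. This is one common way to obtain a power-law degree distribution. The todo item does not say which generator is intended, so treat the following as a sketch only. All names in it are hypothetical and nothing here is part of the test server scripts.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 2048

/* Tiny deterministic LCG so a topology can be reproduced from its seed. */
static uint32_t lcg_state = 1;

static uint32_t lcg_next(void)
{
    lcg_state = lcg_state * 1664525u + 1013904223u;
    return lcg_state >> 8; /* drop the weak low bits */
}

/*
 * Build a connected topology on n nodes by preferential attachment.
 * peer[i] receives the node that node i linked to when it joined
 * (peer[0] is -1), degree[i] receives the final degree of node i.
 * Returns the number of links, which is always n - 1.
 */
int make_power_law_topology(int n, uint32_t seed, int *peer, int *degree)
{
    /* Every link adds both of its endpoints to this list, so a uniform
     * pick from the list selects a node proportionally to its degree. */
    static int endpoints[2 * MAX_NODES];
    int n_endpoints = 0;
    int i;

    assert(n >= 2 && n <= MAX_NODES);
    lcg_state = seed;
    for (i = 0; i < n; i++)
        degree[i] = 0;

    /* Seed link between node 0 and node 1. */
    peer[0] = -1;
    peer[1] = 0;
    degree[0] = degree[1] = 1;
    endpoints[n_endpoints++] = 0;
    endpoints[n_endpoints++] = 1;

    for (i = 2; i < n; i++) {
        int target = endpoints[lcg_next() % (uint32_t)n_endpoints];

        peer[i] = target;
        degree[i]++;
        degree[target]++;
        endpoints[n_endpoints++] = i;
        endpoints[n_endpoints++] = target;
    }
    return n - 1;
}
```

Each (i, peer[i]) pair could then be turned into one virtual link between two UML instances. Because every node adds exactly one link, the result is a tree. Adding more than one link per new node would give a meshier graph.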
User HOWTO
NOTE! You are root on the system. Effectively we need lots of sudo privs. So... use it wisely.
- log in
- make clean
- edit common.sh and adapt the parameters to your needs
#!/bin/sh
#
# VARS
#
MAX_INSTANCES=1500
ROOT_FS=root_fs
NICELVL="-n 5"
u=$USER
#SINGLE=1
We supply you with a good working root filesystem (root_fs) so no need to change that. The SINGLE parameter just says that you want to start a single instance and be logged in (needed for debugging purposes)
- the UML instance can read files and programs from
$HOME/public_uml/share
This is where you can put your programs or your version of olsrd (and its libs) or the B.A.T.M.A.N. binaries.
N.B. This directory is shared between all UML instances that you will start in your simulation, so, they all have read-only access to it. It will appear inside each UML as /mnt/share/. There is also another, per-instance, read-write directory that you can use to save data for later analysis (e.g. redirect olsrd stdout to a file and print some debugging info there). This second directory will be under $HOME/public_uml/exp/<UML IP> (where UML IP is the ip address of each UML instance). It will also appear as /mnt/exp inside UML's environment.
- put your special rcS file into $HOME/public_uml/share/etc/init.d/ . This rcS file will be called from the UML instance's /etc/init.d/rcS startup script. Starting olsrd etc. must be done from this user-supplied rcS. If there is no user-supplied rcS, the standard olsrd with the standard settings of the root_fs (/etc/olsrd.conf) is started.
- make
This will start the simulation.
N.B. When the simulation is started, an olsrd instance is started on the host as well. You can use it if you need to interact with the olsrd network - for instance, topology maps are generated through this instance (see below).
- Issuing commands inside UML manually - the 'make' command creates a screen session for every UML process it creates, and redirects its input and output there. You can use screen to attach to a particular session. Use
screen -ls (as root)
to list all available sessions, and
screen -S blabla.10.0.x.y -d -RR
to attach to a session. This will give you shell access to the system.
N.B. All modifications to the root filesystem will be preserved only for the duration of the simulation! Once it is stopped, changes will be lost!
- observe the success on http://texas.funkfeuer.at or create a new topo map via ( cd /var/www/topo; ./doit.sh ). If you see a complete graph, then your version has little packetloss!
- stop it via
make clean
or
make stop
Please make sure (by looking at http://texas.funkfeuer.at) that you are the only person running a simulation at the moment!
Some things to note
- the topology visualisation scripts run with nice level +5
- the UML instances run with nice level +10 (see run.sh). Never run anything at a higher priority than nice level 0, because that disturbs the system monitoring (munin) tools and we will not be able to see what the simulation is doing.
Open questions/bug reports?
Who wants to contribute?
Who is willing to work on something | Contact info |
---|---|
Aaron Kaplan | mailto:aaron@lo-res.org |
Roman Steiner | mailto:roman.steiner@gmx.at |
Bernd Petrovitsch | mailto:bernd@firmix.at |
Andrej Rursev (zethix) | mailto:zethix@gmail.com |
Hannes Gredler | mailto:hannes@gredler.at |
Who is working on what?
Who | What | Status | Comments |
---|---|---|---|
Bernd Petrovitsch, Thomas Lopatic, Hannes Gredler | release 0.5 | DONE | |
??? | release 0.5 make packages for freifunk FW, DD-WRT, etc, windows (XP, Vista), ... and test them | OPEN | freifunk FW is done by Sven-Ola Tücke, .rpm and .deb by various people on olsr-dev@lists.olsr.org, Windows: ??? |
Aaron | analyze IP autoconfig mechanisms and find the best one | OPEN | |
Hannes Gredler | tcpdump parses olsr packets, | DONE | |
Hannes Gredler | SPF improvements | DONE | |
Hannes Gredler | reduce malloc thrashing during SPF computation | DONE | |
Hannes Gredler | improve post-SPF handling (route table conciliation, best path selection) | DONE | |
Bernd Petrovitsch | rework the logging/tracing/error reporting | WIP | |
Bernd Petrovitsch | rework the LQ-TC and LQ-HELLO input parsing, avoiding malloc thrashing | DONE | The output side can also avoid malloc() and free(). Alas, the code is more complicated there. |
Hannes Gredler | spurious neighbor loss on nodes with high neighbor count | OPEN/investigating | |
Aaron Kaplan,Bernd Petrovitsch | olsr-ng test server | DONE | Well, the thing doesn't boot ATM. God knows why .... |
Aaron Kaplan | theory, complexity analysis. Goal: find the best complexity on the algorithmic side. | DONE | theory says that Fibonacci heaps are best, practice shows that an AVL tree as a min-heap implementation copes better with frequent re-keyings |
Zethix, Aaron Kaplan | UML cluster setup | WIP | Currently we can start around 2000 UML instances, but the uml_switch software still drops packets between virtual interfaces. http://www.openvz.org also seems like a promising solution. |
Aaron Kaplan, Hannes | draft. write a draft about LQ extensions | OPEN | |
Bernd Petrovitsch | Various Cleanup Mini-Projects | DONE/WIP | reworked floating point ops in src/mantissa.[ch] to minimize run-time impact, fixed dependencies, |
Sebastian Sauer | LinkQuality / metrics (e.g. ETX/ETT) improvements | OPEN/WIP (no code yet committed) | evaluate best current practice; |
Sebastian Sauer | FishEye improvements | OPEN/investigating | evaluate best current practice; |
Sebastian Sauer | effect of OLSR parameters on the mesh | OPEN/investigating | evaluate best current practice; spot and (maybe) eliminate dangers/instabilities |
Sebastian Sauer | selfish nodes / malicious nodes | OPEN/investigating | risk assessments |
contact mailto:aaron@lo-res.org or Bernd if you are interested in participating!
Next Steps
- TU Wien lecture "Verteilte Systeme" (Distributed Systems), 20.4.2007: we will present our ideas about optimizing complexity. Aaron also wants to address more students from the TU to participate. DONE. Let's see if new participants want to join.
- finalize the UML test server
- try out the optimization ideas and document the speedup
- more cleanups
- olsrd is doing lots of malloc()s and free()s - use ltrace to see this.
- review the malloc()/free() calls: check whether they are superfluous and can be replaced with buffers on the stack or by just moving pointers around.
- are there structs that are malloc()ed and free()d very frequently? Perhaps a free list can help to avoid lots of malloc()/free() handling.
- we have several coding styles in there
- add wrappers to hide type casts for Windows (and perhaps others). Reserve some prefix (e.g. x is often used for this, as in xmalloc(); olsr_ is IMHO quite long, and there are too many olsr_ prefixed types and functions right now).
- fixup error reporting/tracing/logging
- add synchronization and make the daemon multi-threaded (e.g. the bmf plugin uses threads right now, the httpinfo plugin could benefit from such a thing)
- make the parameter parsing of the plugins more consistent (some are case-sensitive, some are not, most do not check syntax errors). Work in progress
- The incoming and outgoing packets are deserialized and serialized via pointers to packed structs. This is somewhat dangerous, as other compilers, or the same compiler for other architectures, may or may not behave the same. And, worse, it misleads people into copying the same data around several times or playing with pointers so that no one can easily see what's going on. I (Bernd) started with a more direct approach in src/lq_packet.c, where we have one "unsigned char *" which walks sequentially through the incoming packet and, with small inline functions, reads each value into the place where it is needed later on, mostly a simple, normal C struct that is used by the core code.
- net_outbuffer_push() memcpy()es the packet from the caller-supplied buffer into another buffer. That's one more copy operation for every outgoing packet.
- ....
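The "walking pointer" approach to packet parsing mentioned in the cleanup list can be sketched as follows. This is not the code from src/lq_packet.c. The helper names, the struct and its field layout are invented for this example and do not match olsrd's wire format. The sketch only shows the idea: one cursor moves through the buffer, and small inline readers assemble each value byte by byte, so the result does not depend on struct packing, alignment or host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one byte and advance the cursor. */
static inline uint8_t pkt_get_u8(const unsigned char **p)
{
    uint8_t v = (*p)[0];

    *p += 1;
    return v;
}

/* Read a 16-bit value in network byte order and advance the cursor. */
static inline uint16_t pkt_get_u16(const unsigned char **p)
{
    uint16_t v = (uint16_t)(((*p)[0] << 8) | (*p)[1]);

    *p += 2;
    return v;
}

/* Read a 32-bit value in network byte order and advance the cursor. */
static inline uint32_t pkt_get_u32(const unsigned char **p)
{
    uint32_t v = ((uint32_t)(*p)[0] << 24) | ((uint32_t)(*p)[1] << 16) |
                 ((uint32_t)(*p)[2] << 8) | (uint32_t)(*p)[3];

    *p += 4;
    return v;
}

/* Hypothetical header as a plain (not packed) C struct for core code. */
struct example_hdr {
    uint8_t  type;
    uint8_t  ttl;
    uint16_t size;
    uint32_t originator;
};

#define EXAMPLE_HDR_WIRE_LEN 8

/* Returns 0 on success, -1 if the buffer is too short. */
int example_hdr_parse(const unsigned char *buf, size_t len,
                      struct example_hdr *out)
{
    const unsigned char *cur = buf;

    if (len < EXAMPLE_HDR_WIRE_LEN)
        return -1;
    out->type = pkt_get_u8(&cur);
    out->ttl = pkt_get_u8(&cur);
    out->size = pkt_get_u16(&cur);
    out->originator = pkt_get_u32(&cur);
    assert((size_t)(cur - buf) == EXAMPLE_HDR_WIRE_LEN);
    return 0;
}
```

The length check happens once, before the cursor starts moving, so the readers themselves stay trivial. Serialization can mirror this with matching put helpers that write straight into the output buffer.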
Bounties
please take a look at the slides and get in contact with us directly at the moment!
Source code
- CVS repos:
(as user "ipo23")
export CVS_RSH=ssh
cvs -z3 -d:ext:ipo23@olsrd.cvs.sourceforge.net:/cvsroot/olsrd co -P olsrd-current
(as anonymous user)
cvs -d:pserver:anonymous@olsrd.cvs.sourceforge.net:/cvsroot/olsrd login
cvs -z3 -d:pserver:anonymous@olsrd.cvs.sourceforge.net:/cvsroot/olsrd co -P olsrd-current
Theory section
data structures
- Heap ... We need good heaps/priority queues for A*-Search / Dijkstra
- especially the Fibonacci heap has, to my knowledge, the very best asymptotic complexity: amortized O(1) for almost every operation.
However, practice shows a different picture. Currently, as of 0.51pre, we use an AVL tree, which has complexity O(log(n)). Hannes tested the fibheap package from gcc and found that in our networks (~ 200 nodes) the AVL tree heap implementation still beats the Fibonacci heap by 60%.
fibonacci heap:
--- SPF-stats for 203 nodes, 335 routes (total/init/run/route/kern/cleanup):,, 237,
--- SPF-stats for 203 nodes, 337 routes (total/init/run/route/kern/cleanup):,, 238,
--- SPF-stats for 203 nodes, 337 routes (total/init/run/route/kern/cleanup):,, 238,
--- SPF-stats for 203 nodes, 339 routes (total/init/run/route/kern/cleanup):,, 239,
--- SPF-stats for 203 nodes, 339 routes (total/init/run/route/kern/cleanup):,, 238,
--- SPF-stats for 203 nodes, 341 routes (total/init/run/route/kern/cleanup):,, 240,
--- SPF-stats for 203 nodes, 341 routes (total/init/run/route/kern/cleanup):,, 236,
--- SPF-stats for 203 nodes, 341 routes (total/init/run/route/kern/cleanup):,, 238,
--- SPF-stats for 203 nodes, 341 routes (total/init/run/route/kern/cleanup):,, 238,
--- SPF-stats for 203 nodes, 345 routes (total/init/run/route/kern/cleanup):,, 239,
--- SPF-stats for 203 nodes, 346 routes (total/init/run/route/kern/cleanup):,, 238,
--- SPF-stats for 203 nodes, 346 routes (total/init/run/route/kern/cleanup):,, 238,
--- SPF-stats for 203 nodes, 346 routes (total/init/run/route/kern/cleanup):,, 238,
--- SPF-stats for 203 nodes, 346 routes (total/init/run/route/kern/cleanup):,, 238,
--- SPF-stats for 203 nodes, 347 routes (total/init/run/route/kern/cleanup):,, 238,
AVL heap:
--- SPF-stats for 203 nodes, 346 routes (total/init/run/route/kern/cleanup):,, 143,
--- SPF-stats for 203 nodes, 346 routes (total/init/run/route/kern/cleanup):,, 142,
--- SPF-stats for 203 nodes, 346 routes (total/init/run/route/kern/cleanup):,, 142,
--- SPF-stats for 203 nodes, 346 routes (total/init/run/route/kern/cleanup):,, 144,
--- SPF-stats for 203 nodes, 346 routes (total/init/run/route/kern/cleanup):,, 145,
--- SPF-stats for 203 nodes, 347 routes (total/init/run/route/kern/cleanup):,, 145,
--- SPF-stats for 203 nodes, 346 routes (total/init/run/route/kern/cleanup):,, 142,
--- SPF-stats for 203 nodes, 346 routes (total/init/run/route/kern/cleanup):,, 142,
--- SPF-stats for 203 nodes, 346 routes (total/init/run/route/kern/cleanup):,, 144,
--- SPF-stats for 203 nodes, 346 routes (total/init/run/route/kern/cleanup):,, 145,
--- SPF-stats for 203 nodes, 347 routes (total/init/run/route/kern/cleanup):,, 145,
--- SPF-stats for 202 nodes, 347 routes (total/init/run/route/kern/cleanup):,, 145,
--- SPF-stats for 202 nodes, 347 routes (total/init/run/route/kern/cleanup):,, 142,
--- SPF-stats for 202 nodes, 347 routes (total/init/run/route/kern/cleanup):,, 146,
The following complexities<ref> Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest (1990): Introduction to algorithms. MIT Press / McGraw-Hill. </ref> are worst-case for binary and binomial heaps and amortized complexity for Fibonacci heap. O(f) gives asymptotic upper bound and Θ(f) is asymptotically tight bound (see Wikipedia:Big O notation). Function names assume a min-heap.
Operation | Binary | Binomial | Fibonacci |
---|---|---|---|
createHeap | Θ(1) | Θ(1) | Θ(1) |
findMin | Θ(1) | O(lg n) or Θ(1) | Θ(1) |
deleteMin | Θ(lg n) | Θ(lg n) | O(lg n) |
insert | Θ(lg n) | O(lg n) | Θ(1) |
decreaseKey | Θ(lg n) | Θ(lg n) | Θ(1) |
merge | Θ(n) | O(lg n) | Θ(1) |
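To make the "Binary" column of the table concrete, here is a minimal array-based binary min-heap with the three operations that Dijkstra needs: insert, deleteMin and decreaseKey. It is a sketch for illustration, not olsrd code (olsrd uses an AVL tree, as noted above). The struct, the function names and the fixed HEAP_MAX capacity are assumptions of this example.

```c
#include <assert.h>

#define HEAP_MAX 1024

/* Min-heap of node ids keyed by path cost. pos[] records where each id
 * sits in the heap array so that decreaseKey finds it in O(1). */
struct min_heap {
    int      size;
    int      node[HEAP_MAX]; /* heap array of node ids */
    unsigned key[HEAP_MAX];  /* key[id] = current cost of node id */
    int      pos[HEAP_MAX];  /* pos[id] = index in node[], -1 if absent */
};

static void heap_swap(struct min_heap *h, int i, int j)
{
    int a = h->node[i], b = h->node[j];

    h->node[i] = b;
    h->node[j] = a;
    h->pos[b] = i;
    h->pos[a] = j;
}

static void sift_up(struct min_heap *h, int i)
{
    while (i > 0) {
        int parent = (i - 1) / 2;

        if (h->key[h->node[parent]] <= h->key[h->node[i]])
            break;
        heap_swap(h, i, parent);
        i = parent;
    }
}

static void sift_down(struct min_heap *h, int i)
{
    for (;;) {
        int l = 2 * i + 1, r = l + 1, m = i;

        if (l < h->size && h->key[h->node[l]] < h->key[h->node[m]])
            m = l;
        if (r < h->size && h->key[h->node[r]] < h->key[h->node[m]])
            m = r;
        if (m == i)
            break;
        heap_swap(h, i, m);
        i = m;
    }
}

void heap_init(struct min_heap *h)
{
    int i;

    h->size = 0;
    for (i = 0; i < HEAP_MAX; i++)
        h->pos[i] = -1;
}

/* insert: at most lg n swaps on the way up */
void heap_insert(struct min_heap *h, int id, unsigned key)
{
    assert(id >= 0 && id < HEAP_MAX && h->pos[id] == -1);
    h->key[id] = key;
    h->node[h->size] = id;
    h->pos[id] = h->size;
    h->size++;
    sift_up(h, h->pos[id]);
}

/* deleteMin: at most lg n swaps on the way down; returns the id */
int heap_delete_min(struct min_heap *h)
{
    int min_id;

    assert(h->size > 0);
    min_id = h->node[0];
    heap_swap(h, 0, h->size - 1);
    h->size--;
    h->pos[min_id] = -1;
    if (h->size > 0)
        sift_down(h, 0);
    return min_id;
}

/* decreaseKey: at most lg n swaps on the way up */
void heap_decrease_key(struct min_heap *h, int id, unsigned new_key)
{
    assert(h->pos[id] >= 0 && new_key <= h->key[id]);
    h->key[id] = new_key;
    sift_up(h, h->pos[id]);
}
```

In an SPF run, decreaseKey is called whenever a shorter path to a node that is already queued is found. This is the "re-keying" that makes the constant factors of the chosen structure matter in practice.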
- Wikipedia:Data_Structures general overview. Good entry point for trees: Wikipedia:Binary_tree
- NIST Directory of Data Structures has a very extensive overview
- succinct datastructures (trees)
- succinct datastructures overview
- Tries
- sparse matrices
- look at kazlib from IS-IS ??
See also
Notes
<references/>
Links
Papers, Theory
- RFC-3626: the "OLSR RFC"
- Workshop at Hipercom Oct 2006
- OLSR-v2 Draft 01 at hipercom
- http://www.adhocsys.org/
AdHocSys is a two-year European project to provide reliable broadband services in rural and mountain regions. This objective will be achieved by means of the creation of a wireless ad hoc broadband network, with special enhancements to reliability and availability. The network consists of one or several gateways connecting to the global Internet and several intermediate nodes which provide multihop connections between the gateways and end users.
- WOSPF-OR Uni Oslo Wireless OSPF with Overlapping Relays
- W-OSPF INRIA/Boeing Wireless OSPF
- A Cross-Layer Admission Control Framework for Wireless Ad-Hoc Networks using Multiple Antennas, Bechir Hamdaoui and Parameswaran Ramanathan
misc
- Homepage: http://www.olsr.org/
- NATO C3 Agency (NC3A) Radio Protocols Lab https://elayne.nc3a.nato.int/
- commercial INRIA HIPERCOM spin-off http://www.luceor.com/
- commercial MIT Roofnet spin-off http://www.meraki.net/