Current revision as of 4 March 2012, 01:55

UML test server

current load and statistics: http://texas.funkfeuer.at

List of cross dev instances / IP assignment

 IP            OS                       platform         screen session  status
 10.254.0.254  host system              (texas himself)  linux           OK
 10.254.0.1    openbsd                  (qemu, i386)     obsd            NOT OK
 10.254.0.2    freebsd 6.2              (qemu, i386)     fbsd            NOT OK
 10.254.0.3    netbsd                   (qemu, i386)     nbsd            NOT OK
 10.254.0.4    olpc XO (redhat fedora)  (qemu, i386)     xo              NOT OK
 10.254.0.5    windows XP SP2           (qemu, i386)     winxp           NOT OK


[Image: our UML server]

[Image: topo map of 1500 UML instances running in parallel. Note the packet loss! (Check out the TopologyPics archive also.)]

We have already run 2000 instances and there was still plenty of RAM left, so 1000 is a very safe bet. According to the UML documentation we can probably scale up much higher, because UML only takes the RAM that each instance actually needs. UML does have other shortcomings, though: high CPU overhead and lots of context switches. We are trying to improve the performance at the moment...

current open todos UML server

Next important (*) things to do:

  • DONE(aka) update texas's BIOS - FIXED
  • add the packet loss tc rules (zethix already prepared it)
  • create random networks (easy)
  • create network topologies based on a power law distribution (a bit harder, but realistic for the internet)
  • DONE(zethix) create scripts to find out which olsrd instances crashed
  • create scripts to find out if a UML instance is not responsive anymore
  • find better measurement tools. Look into sar
  • DONE(aka) recompile host kernel and get rid of the "BUG: soft lockup detected on CPU#0!" messages
  • DONE(aka) recompile host kernel and enable the preemption patch
  • DONE(zethix,aka) make hostfs so that developers can easily upload a new olsrd version to all uml instances. They should see the difference easily. Look into hostfs
  • DONE(ake) increase performance of the UML simulator itself (decrease HZ, look into SKAS3 patch again, 32 bit recompile, talk with jeff etc)
  • find more meaningful topology visualization tools (http://www.caida.org)
  • add b.a.t.m.a.n to the root filesystem. (?)
  • compare the scheduling / scalability of the test with OpenVZ and olsr_switch
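
One of the open items above, finding UML instances that are not responsive anymore, could be approached roughly like this. It is only a sketch and not a script that exists on texas: the function name unresponsive, the use of ping as the probe and the example addresses are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: print every address whose probe command fails.
# Assumption: an instance counts as responsive if the probe exits with status 0.

# unresponsive PROBE IP...
#   PROBE is a command line (e.g. "ping -c 1 -W 2"); each IP is appended to it.
unresponsive() {
    probe=$1
    shift
    for ip in "$@"; do
        # $probe is left unquoted on purpose so that its options get split
        $probe "$ip" >/dev/null 2>&1 || echo "$ip"
    done
}

# Example (hypothetical addresses):
#   unresponsive "ping -c 1 -W 2" 10.0.0.1 10.0.0.2
```

Passing the probe as an argument keeps the loop independent of the actual check, so ping could later be swapped for something that looks at olsrd itself.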

User HOWTO

 NOTE! You are root on the system, because in effect we need lots of sudo privileges. So... use them wisely.
  1. log in
  2. make clean
  3. edit common.sh and adapt the parameters to your needs
 #!/bin/sh
 #
 # VARS
 #
 MAX_INSTANCES=1500
 ROOT_FS=root_fs
 NICELVL="-n 5"
 u=$USER
 #SINGLE=1

We supply you with a root filesystem that works well (root_fs), so there is no need to change that. The SINGLE parameter means that you want to start a single instance and be logged in to it (needed for debugging purposes).

  1. the UML instance can read files and programs from
 $HOME/public_uml/share

This is where you can put your programs or your version of olsrd (and its libs) or the B.A.T.M.A.N. binaries.

 N.B. This directory is shared between all UML instances that you will 
 start in your simulation, so, they all have read-only access to it. 
 It will appear inside each UML as /mnt/share/. There is also another, 
 per-instance, read-write directory that you can use to save data for 
 later analysis (e.g. redirect olsrd stdout to a file and print some 
 debugging info there). This second directory will be under 
 $HOME/public_uml/exp/<UML IP> (where UML IP is the ip address of each 
 UML instance). It will also appear as /mnt/exp inside UML's environment.
  1. put your special rcS file into $HOME/public_uml/share/etc/init.d/ . This rcS file will be called from the UML instance's /etc/init.d/rcS startup script. Starting olsrd etc. must be done from this user-supplied rcS. If there is no user-supplied rcS, the standard olsrd with the standard settings of the root_fs (/etc/olsrd.conf) is started.
  1. make

This will start the simulation.
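
The preparation that has to happen before running make (putting your own olsrd into the shared directory and supplying an rcS that starts it) might look roughly like this. Treat it as a sketch under assumptions: the function name stage_olsrd, the bin/ subdirectory and the log file name olsrd.log are invented for this example, and whether your olsrd build accepts exactly these options should be checked against its documentation.

```shell
#!/bin/sh
# Sketch: stage a self-built olsrd and a user-supplied rcS in the shared directory.
# Assumptions: the bin/ subdirectory and the olsrd.log file name are made up here.

# stage_olsrd BINARY [SHARE_DIR]
stage_olsrd() {
    bin=$1
    share=${2:-$HOME/public_uml/share}

    mkdir -p "$share/bin" "$share/etc/init.d" || return 1
    cp "$bin" "$share/bin/olsrd" || return 1
    chmod 755 "$share/bin/olsrd"

    # Inside UML the shared directory appears as /mnt/share (read-only)
    # and the per-instance directory as /mnt/exp (read-write).
    cat > "$share/etc/init.d/rcS" <<'EOF'
#!/bin/sh
# user-supplied rcS: start our own olsrd and keep its output for later analysis
/mnt/share/bin/olsrd -f /etc/olsrd.conf -d 1 > /mnt/exp/olsrd.log 2>&1 &
EOF
    chmod 755 "$share/etc/init.d/rcS"
}

# Example: stage_olsrd ./olsrd
```

After the simulation has run, each instance's log would then be found on the host under $HOME/public_uml/exp/<UML IP>/olsrd.log.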

 N.B. When the simulation is started, an olsrd instance is started on 
 the host as well. You can use it if you need to interact with the 
 olsrd network - for instance, topology maps are generated through this
 instance (see below). 
  1. Issuing commands inside UML manually - the 'make' command creates a screen session for every UML process it creates, and redirects its input and output there. You can use screen to attach to a particular session. Use
 screen -ls              (as root)

to list all available sessions, and

 screen -S blabla.10.0.x.y -d -RR

to attach to a session. This will give you shell access to the system.
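
With many instances the output of screen -ls gets long. A small filter like the following could reduce it to the bare session names, which can then be passed to screen -S. This is a sketch: the function name session_names is invented, and it assumes that session lines have the usual "<pid>.<name> ... (Detached)" or "(Attached)" form.

```shell
#!/bin/sh
# Sketch: extract the session names from "screen -ls" output read on stdin.
# Assumption: session lines look like "<pid>.<name> ... (Detached)" or "(Attached)".
session_names() {
    awk '/tached\)/ { name = $1; sub(/^[0-9]+\./, "", name); print name }'
}

# Example: screen -ls | session_names
```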

 N.B. All modifications to the root filesystem will be preserved only 
 for the duration of the simulation! Once it is stopped, changes will 
 be lost!
  1. observe the result on http://texas.funkfeuer.at or create a new topo map via ( cd /var/www/topo; ./doit.sh ). If you see a complete graph, then your version has little packet loss!
  1. stop it via
 make clean 

or

 make stop

Please make sure (by looking at http://texas.funkfeuer.at) that you are the only person running a simulation at the moment!

Some things to note

  • the topology visualisation scripts run with nice level +5
  • the UML instances run with nice level +10 (see run.sh)

Never ever run anything at a higher priority than nice level 0 (that is, never use a negative nice value), because then you will disturb the system monitoring (munin) tools and we will not be able to see what the simulation is doing.
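
To make it harder to break this rule by accident, your own helper scripts could go through a small wrapper that only accepts nice levels of 0 or above. This is a sketch with an invented function name (run_niced); it is not part of run.sh.

```shell
#!/bin/sh
# Sketch: run a command with a non-negative nice level only.
# run_niced LEVEL COMMAND [ARGS...]
run_niced() {
    [ $# -ge 2 ] || { echo "usage: run_niced LEVEL COMMAND [ARGS...]" >&2; return 1; }
    level=$1
    shift
    case $level in
        ''|*[!0-9]*)
            # anything that is not a plain non-negative integer is rejected
            echo "refusing nice level '$level': only 0 or higher is allowed" >&2
            return 1
            ;;
    esac
    nice -n "$level" "$@"
}

# Example: run_niced 10 ./my_analysis.sh   (my_analysis.sh is hypothetical)
```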

Open questions/bug reports?