
11 Running Simulations

11.1 Introduction

This chapter describes how to run simulations. It covers basic usage, user interfaces, running simulation campaigns, and many other topics.

11.2 Simulation Executables vs Libraries

As we have seen in the Build chapter, simulations may be compiled to an executable or to a shared library. When the build output is an executable, it can be run directly. For example, the Fifo example simulation can be run with the following command:

$ ./fifo

Simulations compiled to a shared library can be run using the opp_run program. For example, if we compiled the Fifo simulation to a shared library on Linux, the build output would be a libfifo.so file that could be run with the following command:

$ opp_run -l fifo

The -l option tells opp_run to load the given shared library. The -l option will be covered in detail in section [11.9].

11.3 Command-Line Options

The above commands illustrate just the simplest case. Usually you will need to add extra command-line options, for example to specify what ini file(s) to use, which configuration to run, which user interface to activate, where to load NED files from, and so on. The rest of this chapter will cover these options.

To get a complete list of command line options accepted by simulations, run the opp_run program (or any other simulation executable) with -h:

$ opp_run -h

Or:

$ ./fifo -h

11.4 Configuration Options on the Command Line

Configuration options can also be specified on the command line, not only in ini files. To do so, prefix the option name with a double dash, and append the value after an equal sign. Make sure there are no spaces around the equal sign. If the value contains spaces or shell metacharacters, you'll need to protect the value (or the whole option) with quotes or apostrophes.

Example:

$ ./fifo --debug-on-errors=true
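
For instance, a value containing shell metacharacters needs quoting (the parameter assignment below is purely illustrative):

$ ./fifo '--**.gen.sendIaTime=exponential(0.2s)'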

In case an option is specified both on the command line and in an ini file, the command line takes precedence.

To get the list of all possible configuration options, use the -h config option. (The additional -s option below just makes the output less verbose.)

$ opp_run -s -h config
Supported configuration options:
  **.bin-recording=<bool>, default:true; per-object setting
  check-signals=<bool>, default:true; per-run setting
  cmdenv-autoflush=<bool>, default:false; per-run setting
  cmdenv-config-name=<string>; global setting
  ...

To see the option descriptions as well, use -h configdetails.

$ opp_run -h configdetails

11.5 Specifying Ini Files

The default ini file is omnetpp.ini, and is loaded if no other ini file is given on the command line.

Ini files can be specified both as plain arguments and with the -f option, so the following two commands are equivalent:

$ ./fifo experiment.ini common.ini
$ ./fifo -f experiment.ini -f common.ini

Multiple ini files can be given, and their contents will be merged. This allows for partitioning the configuration into separate files, for example into files containing simulation options, module parameters, and result recording options, respectively.
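
A minimal sketch of such a split (the file names and contents are illustrative):

# common.ini
[General]
network = FifoNet
sim-time-limit = 100s

# experiment.ini
[Config Fifo1]
**.gen.sendIaTime = exponential(0.2s)
**.vector-recording = false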

11.6 Specifying the NED Path

NED files are loaded from directories listed on the NED path. More precisely, they are loaded from the listed directories and their whole subdirectory trees. Directories are separated with a semicolon (;).

The NED path can be specified in several ways: with the -n command-line option, with the NEDPATH environment variable, or with the ned-path configuration option.

NED path resolution rules are as follows:

  1. OMNeT++ checks for NED path specified on the command line with the -n option
  2. If not found on the command line, it checks for the NEDPATH environment variable
  3. The ned-path option value from the ini file is appended to the result of the above steps
  4. If the result is still empty, it falls back to "." (the current directory)
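
For example (the directory names are illustrative), the NED path can be given on the command line or via the environment variable:

$ ./aloha -n '.;../common/ned'
$ NEDPATH='.;../common/ned' ./aloha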

11.7 Selecting a User Interface

OMNeT++ simulations can be run under different user interfaces, a.k.a. runtime environments. Currently the following user interfaces are supported: Qtenv (a Qt-based graphical user interface), Tkenv (the older Tcl/Tk-based graphical user interface), and Cmdenv (a command-line user interface designed for batch execution).

You would typically test and debug your simulation under Tkenv or Qtenv, then run actual simulation experiments from the command line or from a shell script, using Cmdenv. Tkenv and Qtenv are also better suited for educational and demonstration purposes.

User interfaces are provided in the form of libraries that can be linked statically or dynamically with the simulation, or loaded at runtime.

When several user interface libraries are available in a simulation program, the user can select via command-line or ini file options which one to use. In the absence of such an option, the one with the highest priority will be started. Currently priorities are set such that Qtenv has the highest priority, then Tkenv, and finally Cmdenv. By default, simulations are linked with all available user interfaces, but this can be controlled via opp_makemake options or in the OMNeT++ global build configuration as well. The user interfaces available in a simulation program can be listed by running it with the -h userinterfaces option.

You can explicitly select a user interface on the command line with the -u option (specify Qtenv, Tkenv or Cmdenv as its argument), or by adding the user-interface option to the configuration. If both the config option and the command line option are present, the command line option takes precedence.

Since the graphical interfaces are the default (have higher priority), the most common use of the -u option is to select Cmdenv, e.g. for batch execution. The following example performs all runs of the Aloha example simulation using Cmdenv:

$ ./aloha -c PureAlohaExperiment -u Cmdenv

11.8 Selecting Configurations and Runs

All user interfaces support the -c <configname> and -r <runfilter> options for selecting which simulation(s) to run.

The -c option expects the name of an ini file configuration as an argument. The -r option may be needed when the configuration expands to multiple simulation runs. That is the case when the configuration defines a parameter study (see section [10.4]), or when it contains a repeat configuration option that prescribes multiple repetitions with different RNG seeds (see section [10.4.6]). The -r option can then be used to select a subset of all runs (or one specific run, for that matter). A missing -r option selects all runs in the given configuration.

It depends on the particular user interface how it interprets the -c and -r options. Cmdenv performs all selected simulation runs (optionally stopping after the first one that finishes with an error). GUI interfaces like Qtenv and Tkenv may use this information to fill the run selection dialog (or to set up the simulation automatically if there is only one matching run).

11.8.1 Run Filter Syntax

The run filter accepts two syntaxes: a comma-separated list of run numbers or run number ranges (for example 1,2,5-10), or an arithmetic expression. The arithmetic expression is similar to constraint expressions in the configuration (see section [10.4.5]). It may refer to iteration variables and to the repeat counter with the dollar syntax: $numHosts, $repetition. An example: $numHosts>10 && $mean==2.

Note that due to the presence of the dollar sign (and spaces), the expression should be protected against shell expansion, e.g. using apostrophes:

$ ./aloha -c PureAlohaExperiment -r '$numHosts>10 && $mean<2'

Or, with double quotes:

$ ./aloha -c PureAlohaExperiment -r "\$numHosts>10 && \$mean<2"

11.8.2 The Query Option

The -q (query) option complements -c and -r, and allows one to list the runs matched by the run filter. -q expects an argument that defines the format and verbosity of the output. Several formats are available: numruns, runnumbers, runs, rundetails, runconfig. Use opp_run -h to get a complete list.

-q runs prints one line of information with the iteration variables about each run that the run filter matches. An example:

$ ./aloha -s -c PureAlohaExperiment -r '$numHosts>10 && $mean<2' -q runs
Run 14: $numHosts=15, $mean=1, $repetition=0
Run 15: $numHosts=15, $mean=1, $repetition=1
Run 28: $numHosts=20, $mean=1, $repetition=0
Run 29: $numHosts=20, $mean=1, $repetition=1

The -s option just makes the output less verbose.

If you need more information, use -q rundetails or -q runconfig. rundetails is like runs, but in addition to the iteration variables it also prints a summary of the configuration (the expanded values of configuration entries that contain iteration variables) for each matching run:

$ ./aloha -s -c PureAlohaExperiment -r '$numHosts>10 && $mean<2' -q rundetails
Run 14: $numHosts=15, $mean=1, $repetition=0
    Aloha.numHosts = 15
    Aloha.host[*].iaTime = exponential(1s)

Run 15: $numHosts=15, $mean=1, $repetition=1
    Aloha.numHosts = 15
    Aloha.host[*].iaTime = exponential(1s)
...

The numruns and runnumbers formats are mainly intended for use in scripts. They just print the number of matching runs and the plain run number list, respectively.

$ ./aloha -s -c PureAlohaExperiment -r '$numHosts>10 && $mean<2' -q numruns
4
$ ./aloha -s -c PureAlohaExperiment -r '$numHosts>10 && $mean<2' -q runnumbers
 14 15 28 29

The -q option also encapsulates some unrelated functionality: -q sectioninheritance ignores -r, and prints the inheritance chain of the ini file sections (the inheritance graph after linearization) for the configuration denoted by -c.

11.9 Loading Extra Libraries

OMNeT++ allows you to load shared libraries at runtime. These shared libraries may contain model code (e.g. simple module implementation classes), dynamically registered classes that extend the simulator's functionality (for example NED functions, result filters/recorders, figure types, schedulers, output vector/scalar writers, Qtenv inspectors, or even custom user interfaces), or other code.

Libraries can be specified with the -l <libraryname> command line option (there can be several -l's on the command line), or with the load-libs configuration option. The values from the command line and the config file will be merged.

The platform-specific prefix and suffix of the library file name can be omitted (the extensions .dll, .so and .dylib, and also the common lib prefix on Unix-like systems). This means that you can specify the library name in a platform-independent way: if you specify -l foo, then OMNeT++ will look for foo.dll, libfoo.dll, libfoo.so or libfoo.dylib, depending on the platform.

OMNeT++ will use the dlopen() or LoadLibrary() system call to load the library. To ensure that the system call finds the file, either specify the library name with a full path (the prefix and suffix of the file name can still be omitted), or adjust the shared library path environment variable of your OS: PATH on Windows, LD_LIBRARY_PATH on Unix, and DYLD_LIBRARY_PATH on Mac OS X.
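
For example (the library and directory names are illustrative), a model library built in a sibling project could be run like this on Linux:

$ LD_LIBRARY_PATH=../mymodels/src:$LD_LIBRARY_PATH opp_run -l mymodels -n ../mymodels/src omnetpp.ini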

11.10 Stopping Condition

The most common way of specifying when to finish the simulation is to set a time limit. Simulation time and CPU time limits can be set with the sim-time-limit and cpu-time-limit configuration options, respectively.

An example:

$ ./fifo --sim-time-limit=500s

If several time limits are set together, the simulation will stop when the first one is hit.

If needed, the simulation may also be stopped programmatically, for example when results of a (steady-state) simulation have reached the desired accuracy. This can be done by calling the endSimulation() method.
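
A minimal sketch of programmatic stopping (the module and statistic names are hypothetical; cStdDev is the kernel's basic statistics class):

// a sketch, assuming the module has a member: cStdDev stat;
void MySink::handleMessage(cMessage *msg)
{
    stat.collect(simTime() - msg->getCreationTime());
    delete msg;

    // stop once enough samples were collected and the relative spread is small
    if (stat.getCount() >= 1000 && stat.getStddev() / stat.getMean() < 0.01)
        endSimulation();
}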

11.11 Controlling the Output

Configuration options such as record-eventlog, scalar-recording and vector-recording can be used to enable or disable the creation of various output files during simulation.

These configuration options, like any other, can be specified both in ini files and on the command line. An example:

$ ./fifo --record-eventlog=true --scalar-recording=false --vector-recording=false
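
The same settings in ini file form (note that scalar-recording and vector-recording are per-object options, so in an ini file they are normally written with a wildcard pattern):

[General]
record-eventlog = true
**.scalar-recording = false
**.vector-recording = false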

11.12 Debugging

Debugging is a task that comes up often during model development. Configuration options related to C++ debugging include debug-on-errors, used in the example below.

An example that launches the simulation under the gdb debugger:

$ gdb --args ./aloha --debug-on-errors=true
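
A minimal sketch of the resulting gdb session:

(gdb) run
...
(gdb) bt

With debug-on-errors enabled, a runtime error raises a debugger trap, so gdb stops at the point of the error; the bt (backtrace) command then shows the call stack that led there.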

11.13 Debugging Leaked Messages

The most common cause of memory leaks in OMNeT++ simulations is forgetting to delete messages. When this happens, the simulation process will continually grow in size as the simulation progresses, and when left to run long enough, it will eventually cause an out-of-memory condition.

Luckily, this problem is easy to identify, as all user interfaces display the number of message objects currently in the system. Take a look at the following example Cmdenv output:

...
** Event #1908736   t=58914.051870113485   Elapsed: 2.000s (0m 02s)
     Speed:     ev/sec=954368   simsec/sec=29457   ev/simsec=32.3987
     Messages:  created: 561611   present: 21   in FES: 34
** Event #3433472   t=106067.401570204991   Elapsed: 4.000s (0m 04s)
     Speed:     ev/sec=762368   simsec/sec=23576.7   ev/simsec=32.3357
     Messages:  created: 1010142   present: 354   in FES: 27
** Event #5338880   t=165025.763387178965   Elapsed: 6.000s (0m 06s)
     Speed:     ev/sec=952704   simsec/sec=29479.2   ev/simsec=32.3179
     Messages:  created: 1570675   present: 596   in FES: 21
** Event #6850304   t=211763.433233042017   Elapsed: 8.000s (0m 08s)
     Speed:     ev/sec=755712   simsec/sec=23368.8   ev/simsec=32.3385
     Messages:  created: 2015318   present: 732   in FES: 38
** Event #8753920   t=270587.781554343184   Elapsed: 10.000s (0m 10s)
     Speed:     ev/sec=951808   simsec/sec=29412.2   ev/simsec=32.361
     Messages:  created: 2575634   present: 937   in FES: 32
** Event #10270208   t=317495.244698246477   Elapsed: 12.000s (0m 12s)
     Speed:     ev/sec=758144   simsec/sec=23453.7   ev/simsec=32.3251
     Messages:  created: 3021646   present: 1213   in FES: 20
...

The interesting parts are the present: values. Their steady increase is an indication that the simulation model, i.e. one or more modules in it, is missing some delete msg calls. It is best to use Qtenv or Tkenv to narrow down the issue to specific modules and/or message types.

Qtenv and Tkenv are also able to display the number of messages currently in the simulation. The numbers are displayed on the status bar. If you find that the number of messages is steadily increasing, you need to find where the message objects are located. This can be done with the help of the Find/Inspect Objects dialog.

If the number of messages is stable, it is still possible that the simulation is leaking other cObject-based objects; they can also be found using the Find/Inspect Objects dialog.

If the simulation is leaking non-OMNeT++ objects (i.e. not something derived from cObject) or other memory blocks, Cmdenv, Tkenv or Qtenv cannot help in tracking down the issue.

11.14 Debugging Other Memory Problems

Technically, memory leaks are only a subset of problems associated with memory allocations, i.e. the usage of new and delete in C++.

There are specialized tools that can help in tracking down memory allocation problems (memory leaks, double deletion, referencing deleted blocks, etc.), Valgrind being probably the best-known one on Linux.

11.15 Profiling

When a simulation runs correctly but is too slow, you might want to profile it. Profiling basically means collecting runtime information about how much time is spent at various parts of the program, in order to find places where optimizing the code would have the most impact.

However, there are a few other options you can try before resorting to profiling and optimizing. First, verify that it is the simulation itself that is slow. Make sure that features like eventlog recording are not accidentally turned on. Run the simulation under Cmdenv to eliminate any possible overhead from Qtenv/Tkenv. If you must run the simulation under Qtenv/Tkenv, you can still gain speed by disabling animation features, closing all inspectors, hiding UI elements like the timeline, and so on.

Also, compile your code in release mode (with make MODE=release, see [9.2.3]) instead of debug. That can make a huge difference, especially with heavily templated code.

Well-known profiling tools include gprof, Valgrind's Callgrind tool, and the Linux perf utility.

11.16 Checkpointing

Debugging long-running simulations can be challenging, because one often needs to run the simulation for a long time just to get to the point of failure and be able to start debugging.

Checkpointing can facilitate debugging such errors. It is a technique that basically consists of saving a snapshot of the application's state, and being able to resume execution from there, even multiple times. OMNeT++ itself contains no checkpointing functionality, but it is available via external tools. It depends on the tool whether it is able to restore GUI windows (usually it cannot).

Checkpointing tools available on Linux include DMTCP and CRIU.

11.17 Using Cmdenv

Cmdenv is a lightweight, command line user interface that compiles and runs on all platforms. Cmdenv is designed primarily for batch execution.

Cmdenv simply executes some or all simulation runs that are described in the configuration file. The runs to be executed can be passed via command-line arguments or configuration options.

Cmdenv runs simulations in the same process. This means that e.g. if one simulation run writes a global variable, subsequent runs will also see the change. This is one reason why global variables in models are strongly discouraged.

11.17.1 Sample Output

When you run the Fifo example under Cmdenv, you should see something like this:

$ ./fifo -u Cmdenv -c Fifo1

OMNeT++ Discrete Event Simulation  (C) 1992-2017 Andras Varga, OpenSim Ltd.
Version: 5.0, edition: Academic Public License -- NOT FOR COMMERCIAL USE
See the license for distribution terms and warranty disclaimer
Setting up Cmdenv...
Loading NED files from .: 5

Preparing for running configuration Fifo1, run #0...
Scenario: $repetition=0
Assigned runID=Fifo1-0-20090104-12:23:25-5792
Setting up network 'FifoNet'...
Initializing...
Initializing module FifoNet, stage 0
Initializing module FifoNet.gen, stage 0
Initializing module FifoNet.fifo, stage 0
Initializing module FifoNet.sink, stage 0

Running simulation...
** Event #1   t=0   Elapsed: 0.000s (0m 00s)  0% completed
     Speed:     ev/sec=0   simsec/sec=0   ev/simsec=0
     Messages:  created: 2   present: 2   in FES: 1
** Event #232448   t=11719.051014922336   Elapsed: 2.003s (0m 02s)  3% completed
     Speed:     ev/sec=116050   simsec/sec=5850.75   ev/simsec=19.8351
     Messages:  created: 58114   present: 3   in FES: 2
...
** Event #7206882   t=360000.52066583684   Elapsed: 78.282s (1m 18s)  100% completed
     Speed:     ev/sec=118860   simsec/sec=5911.9   ev/simsec=20.1053
     Messages:  created: 1801723   present: 3   in FES: 2

<!> Simulation time limit reached -- simulation stopped.

Calling finish() at end of Run #0...
End.

As Cmdenv runs the simulation, it periodically prints the sequence number of the current event, the simulation time, the elapsed (real) time, and the performance of the simulation (how many events are processed per second; in the first status line these values are 0 because there was not yet enough data to calculate them). At the end of the simulation, the finish() methods of the simple modules are run, and their output is displayed.

11.17.2 Selecting Runs, Batch Operation

The most important command-line options for Cmdenv are -c and -r for selecting which simulations to perform. (They were described in section [11.8].) They also have their equivalent configuration options that can be written in files as well: cmdenv-config-name and cmdenv-runs-to-execute.

Another configuration option, cmdenv-stop-batch-on-error, controls Cmdenv's behavior when performing multiple runs: it determines whether Cmdenv should stop after the first run that finishes with an error. By default, it does.
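
For example, a sketch of the corresponding ini file entries (the configuration name and run numbers are illustrative):

[General]
cmdenv-config-name = PureAlohaExperiment
cmdenv-runs-to-execute = 0,2-4
cmdenv-stop-batch-on-error = false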

When performing multiple runs, Cmdenv prints run statistics at the end. Example output:

$ ./aloha -c PureAlohaExperiment -u Cmdenv
...
Run statistics: total 42, successful 30, errors 1, skipped 11

11.17.3 Express Mode

Cmdenv can execute simulations in two modes: Express mode, which prints only periodic status reports and is optimized for batch execution, and Normal mode, which prints detailed information (such as event banners) about the simulation's progress and is therefore mainly useful for testing and debugging.

The default mode is Express. To turn off Express mode, specify false for the cmdenv-express-mode configuration option:

$ ./fifo -u Cmdenv -c Fifo1 --cmdenv-express-mode=false

There are several other options that also affect Express-mode and Normal-mode behavior.

See Appendix [26] for more information about these options.
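
For example, Express mode with the detailed performance display (described below) can be requested in an ini file like this:

[General]
cmdenv-express-mode = true
cmdenv-performance-display = true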

11.17.3.1 Interpreting Express-Mode Output

When the simulation is running in Express mode with detailed performance display enabled (cmdenv-performance-display=true), Cmdenv periodically outputs a three-line status report about the progress of the simulation. The output looks like this:

...
** Event #250000   t=123.74354 ( 2m  3s)    Elapsed: 0m 12s
     Speed:     ev/sec=19731.6   simsec/sec=9.80713   ev/simsec=2011.97
     Messages:  created: 55532   present: 6553   in FES: 8
** Event #300000   t=148.55496 ( 2m 28s)    Elapsed: 0m 15s
     Speed:     ev/sec=19584.8   simsec/sec=9.64698   ev/simsec=2030.15
     Messages:  created: 66605   present: 7815   in FES: 7
...

The first line of the status display (beginning with **) contains the sequence number of the current event, the current simulation time, and the elapsed (real) time.

The second line displays simulation performance metrics: ev/sec is the number of events processed per second of real time, simsec/sec is the amount of simulation time advanced per second of real time, and ev/simsec is the number of events per simulated second, which is largely a property of the model rather than of the hardware.

The third line displays the number of messages created, currently present, and currently scheduled in the FES (future event set); it is important because it may indicate the "health" of your simulation.

The second value, the number of messages present, is more useful than perhaps one would initially think. It can be an indicator of the "health" of the simulation; if it is growing steadily, then either you have a memory leak and are losing messages (which indicates a programming error), or the network you simulate is overloaded and queues are steadily filling up (which might indicate wrong input parameters).

Of course, if the number of messages does not increase, it does not mean that you do not have a memory leak (other memory leaks are also possible). Nevertheless the value is still useful, because by far the most common way of leaking memory in a simulation is by not deleting messages.

11.17.4 Other Options

Cmdenv has more configuration options than mentioned in this section; see the options beginning with cmdenv- in Appendix [26] for the complete list.

11.18 The Qtenv Graphical User Interface

Qtenv is a runtime simulation GUI. Qtenv supports interactive simulation execution, animation, tracing and debugging. Qtenv is recommended in the development stage of a simulation and for presentation purposes, since it allows one to get a detailed picture of the state of simulation at any point of execution and to follow what happens inside the network. Note that 3D visualization support and smooth animation support are only available in Qtenv. As of version OMNeT++ 5.1, Qtenv is the default user interface, and Tkenv is in maintenance mode.

11.18.1 Command-Line and Configuration Options

Simulations run under Qtenv accept all general command-line and configuration options, including -c and -r. Configuration options specific to Qtenv have the qtenv- prefix; see Appendix [26] for the list of possible configuration options.

11.19 The Tkenv Graphical User Interface

Tkenv is the traditional Tcl/Tk-based graphical runtime user interface. Tkenv supports interactive simulation execution, tracing and debugging. Tkenv is recommended in the development stage of a simulation and for presentation purposes, since it allows one to get a detailed picture of the state of simulation at any point of execution and to follow what happens inside the network.

11.19.1 Command-Line and Configuration Options

Simulations run under Tkenv accept all general command-line and configuration options, including -c and -r. Configuration options specific to Tkenv have the tkenv- prefix; see Appendix [26] for the list of possible configuration options.

11.20 Running Simulation Campaigns

Once your model works reliably, you will usually want to run several simulations, either to explore the parameter space via a parameter study (see section [10.4]), or to do multiple repetitions with different RNG seeds to increase the statistical accuracy of the results (see section [10.4.6]).

In this section, we will explore several ways to run batches of simulations efficiently.

11.20.1 The Naive Approach

Assume that you want to run the parameter study in the Aloha example simulation for the numHosts>15 cases.

The first idea is that Cmdenv is capable of running simulation batches. The following command will do the job:

$ ./aloha -u Cmdenv -c PureAlohaExperiment -r '$numHosts>15'
...
Run statistics: total 14, successful 14
End.

It works fine. However, this approach has some drawbacks, which become apparent when running hundreds or thousands of simulation runs.

  1. It uses only one CPU. In the age of multi-core CPUs, this is not very efficient.
  2. It is more prone to C++ programming errors in the model. A failure in a single run may abort execution (segfault) or corrupt the process state, possibly invalidating the results of subsequent runs.

To address the second drawback, we can execute each simulation run in its own Cmdenv instance.

$ ./aloha -c PureAlohaExperiment -r '$numHosts>15' -s -q runnumbers
28 29 30 31 32 33 34 35 36 37 38 39 40 41
$ ./aloha -u Cmdenv -c PureAlohaExperiment -r 28
$ ./aloha -u Cmdenv -c PureAlohaExperiment -r 29
$ ./aloha -u Cmdenv -c PureAlohaExperiment -r 30
...
$ ./aloha -u Cmdenv -c PureAlohaExperiment -r 41

It's a lot of commands to issue manually, but luckily it can be automated with a shell script like this:

#! /bin/sh
RUNS=$(./aloha -c PureAlohaExperiment -r '$numHosts>15' -s -q runnumbers)
for i in $RUNS; do
    ./aloha -u Cmdenv -c PureAlohaExperiment -r $i
done

Save the above into a text file called e.g. runAloha. Then give it executable permission, and run it:

$ chmod +x runAloha
$ ./runAloha

It will execute the simulations one-by-one, each in its own Cmdenv instance.

This approach involves a process start overhead for each simulation. Normally, this overhead is small compared to the time spent simulating. However, it may become more of a problem when running a large number of very short simulations (<<1s in CPU time). This effect may be mitigated by letting Cmdenv do several (e.g. 10) simulations in one go.
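
Since the -r option also accepts a comma-separated run number list, one Cmdenv process can perform several runs back to back, for example:

$ ./aloha -u Cmdenv -c PureAlohaExperiment -r 28,29,30,31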

And then, the script still uses only one CPU. It would be better to keep all CPUs busy. For example, if you have 8 CPUs, there should be eight processes running all the time -- when one terminates, another would be launched in its place. You might notice that this behavior is similar to what GNU Make's -j<numJobs> option does. The opp_runall utility, to be covered in the next section, exploits GNU Make to schedule the running of simulations on multiple CPUs.

11.20.2 Using opp_runall

OMNeT++ has a utility program called opp_runall, which allows you to execute simulations using multiple CPUs and multiple processes.

opp_runall groups simulation runs into batches. Every batch corresponds to a Cmdenv process, that is, runs of a batch execute sequentially inside the same Cmdenv process. Batches (i.e. Cmdenv instances) are scheduled for running so that they keep all CPUs busy. The batch size as well as the number of CPUs to use have sensible defaults but can be overridden.

11.20.2.1 Command Line

opp_runall expects the normal simulation command in its argument list. The first positional (non-option) argument and all following arguments are treated as the simulation command (simulation program and its arguments).

Thus, to modify a normal Cmdenv simulation command to make use of multiple CPUs, simply prefix it with opp_runall:

$ opp_runall ./aloha -u Cmdenv -c PureAlohaExperiment -r '$numHosts>15'

Options intended for opp_runall should come before the simulation command. These options include -b<N> for specifying the batch size, and -j<N> to specify the number of CPUs to use.

$ opp_runall -j8 -b4 ./aloha -u Cmdenv -c PureAlohaExperiment -r '$numHosts>15'

11.20.2.2 How It Works

First, opp_runall invokes the simulation command with extra command arguments (-s -q runnumbers) to figure out the list of runs it needs to perform, and groups the run numbers into batches. Then it exploits GNU make and its -j<N> option to do the heavy lifting. Namely, it generates a temporary makefile that allows make to run batches in parallel, and invokes make with the appropriate -j option. It is also possible to export the makefile for inspection and/or running it manually.

To illustrate the above, here is the content of such a makefile:

#
# This makefile was generated with the following command:
# opp_runall -j2 -b4 -e tmp ./aloha -u Cmdenv -c PureAlohaExperiment -r $numHosts>15 
#

SIMULATIONCMD = ./aloha -u Cmdenv -c PureAlohaExperiment -s \
                --cmdenv-redirect-output=true
TARGETS =  batch0 batch1 batch2 batch3

.PHONY: $(TARGETS)

all: $(TARGETS)
    @echo All runs completed.

batch0:
    $(SIMULATIONCMD) -r 28,29,30,31

batch1:
    $(SIMULATIONCMD) -r 32,33,34,35

batch2:
    $(SIMULATIONCMD) -r 36,37,38,39

batch3:
    $(SIMULATIONCMD) -r 40,41

11.20.3 Exploiting Clusters

With large scale simulations, using one's own desktop computer might not be enough. The solution could be to run the simulation on remote machines, that is, to employ a computing cluster.

In simple setups, cross-mounting the file system that contains OMNeT++ and the model, and using ssh to run the simulations might already provide a good solution.

In other cases, submitting simulation jobs and harvesting the results might be done via batch-queuing, cluster computing or grid computing middleware, such as Slurm or HTCondor.

11.21 Akaroa Support: Multiple Replications in Parallel

11.21.1 Introduction

Typical simulations are Monte Carlo simulations: they use (pseudo-)random numbers to drive the simulation model. For the simulation to produce statistically reliable results, one has to carefully consider two questions: how long the initial transient period lasts, i.e. when meaningful data collection may begin, and how long the simulation has to be run for the results to reach the required statistical accuracy.

Neither question is trivial to answer. One might just suggest to wait “very long” or “long enough”. However, this is neither simple (how do you know what is “long enough”?) nor practical (even with today's high speed processors simulations of modest complexity can take hours, and one may not afford multiplying runtimes by, say, 10, “just to be safe.”) If you need further convincing, please read [Pawlikowsky02] and be horrified.

A possible solution is to look at the statistics while the simulation is running, and decide at runtime when enough data have been collected for the results to have reached the required accuracy. One possible criterion is the width of the confidence interval relative to the mean, at a given confidence level. However, it is not known in advance how many observations have to be collected to achieve this accuracy -- it must be determined at runtime.

11.21.2 What Is Akaroa

Akaroa [Akaroa99] addresses the above problem. According to its authors, Akaroa (Akaroa2) is a “fully automated simulation tool designed for running distributed stochastic simulations in MRIP scenario” in a cluster computing environment.

MRIP stands for Multiple Replications in Parallel. In MRIP, the computers of the cluster run independent replications of the whole simulation process (i.e. with the same parameters but different seeds for the RNGs (random number generators)), generating statistically equivalent streams of simulation output data. These data streams are fed to a global data analyser responsible for analysis of the final results and for stopping the simulation when the results reach a satisfactory accuracy.

The simulation processes run independently of one another and continuously send their observations to the central analyser and control process. This process combines the independent data streams, and calculates from these observations an overall estimate of the mean value of each parameter. Akaroa2 decides, based on a given confidence level and precision, whether it has enough observations or not. When it judges that it has enough observations, it halts the simulation.

If n processors are used, the required simulation execution time is usually about n times smaller than that of a one-processor simulation (the required number of observations is produced sooner). Thus, the simulation is sped up approximately in proportion to the number of processors used, and sometimes even more.

Akaroa was designed at the University of Canterbury in Christchurch, New Zealand and can be used free of charge for teaching and non-profit research activities.

11.21.3 Using Akaroa with OMNeT++

11.21.3.1 Starting Akaroa

Before the simulation can be run in parallel under Akaroa, you have to start up the system: launch akmaster in the background on one host, then start an akslave process in the background on each host that is to take part in the simulation.

Each akslave establishes a connection with the akmaster.

Then you use akrun to start a simulation. akrun waits for the simulation to complete, and writes a report of the results to the standard output. The basic usage of the akrun command is:

$ akrun -n num_hosts command [argument..]

where command is the name of the simulation you want to start. Parameters for Akaroa are read from the file named Akaroa in the working directory. Collected data from the processes are sent to the akmaster process, and when the required precision has been reached, akmaster tells the simulation processes to terminate. The results are written to the standard output.
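
A hypothetical invocation that runs the Aloha example under Cmdenv on four hosts might look like this:

$ akrun -n 4 ./aloha -u Cmdenv -c PureAlohaExperiment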

The above description is not detailed enough to help you set up and successfully use Akaroa -- for that you need to read the Akaroa manual.

11.21.3.2 Configuring OMNeT++ for Akaroa

First of all, you have to compile OMNeT++ with Akaroa support enabled.

The OMNeT++ simulation must be configured in omnetpp.ini so that it passes the observations to Akaroa. The simulation model itself does not need to be changed -- it continues to write the observations into output vectors (cOutVector objects, see chapter [7]). You can place some of the output vectors under Akaroa control.

You need to add the following to omnetpp.ini:

[General]
rng-class = "cAkaroaRNG"
outputvectormanager-class = "cAkOutputVectorManager"

These lines cause the simulation to obtain random numbers from Akaroa, and allow data written to selected output vectors to be passed to Akaroa's global data analyser.

Akaroa's RNG is a Combined Multiple Recursive pseudorandom number generator (CMRG) with a period of approximately 2^191 random numbers, and provides a unique stream of random numbers for every simulation engine.

Then you need to specify which output vectors you want to be under Akaroa control (by default, none of them are). You can use the *, ** wildcards (see section [10.3.1]) to place certain vectors under Akaroa control.

<modulename>.<vectorname1>.with-akaroa = true
<modulename>.<vectorname2>.with-akaroa = true
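
For example (the module path and vector name are illustrative):

FifoNet.sink.lifetime.with-akaroa = true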

11.21.3.3 Using Shared File Systems

It is usually practical to have the same physical disk mounted (e.g. via NFS or Samba) on all computers in the cluster. However, because all OMNeT++ simulation processes run with the same settings, they would overwrite each other's output files. You can prevent this from happening by using the fname-append-host ini file entry:

[General]
fname-append-host = true

When turned on, it appends the host name to the names of the output files (output vector, output scalar, snapshot files).


