
11 Running Simulations

11.1 Introduction

This chapter describes how to run simulations. It covers basic usage, user interfaces, running simulation campaigns, and many other topics.

11.2 Simulation Executables vs Libraries

As we have seen in the Build chapter, simulations may be compiled to an executable or to a shared library. When the build output is an executable, it can be run directly. For example, the Fifo example simulation can be run with the following command:

$ ./fifo

Simulations compiled to a shared library can be run using the opp_run program. For example, if we compiled the Fifo simulation to a shared library on Linux, the build output would be a libfifo.so file that could be run with the following command:

$ opp_run -l fifo

The -l option tells opp_run to load the given shared library. The -l option will be covered in detail in section [11.9].

11.3 Command-Line Options

The above commands illustrate just the simplest case. Usually you will need to add extra command-line options, for example to specify what ini file(s) to use, which configuration to run, which user interface to activate, where to load NED files from, and so on. The rest of this chapter will cover these options.

To get a complete list of command line options accepted by simulations, run the opp_run program (or any other simulation executable) with -h:

$ opp_run -h

Or:

$ ./fifo -h

11.4 Configuration Options on the Command Line

Configuration options can also be specified on the command line, not only in ini files. To do so, prefix the option name with a double dash, and append the value after an equal sign. Make sure there are no spaces around the equal sign. If the value contains spaces or shell metacharacters, you'll need to protect the value (or the whole option) with quotes or apostrophes.

Example:

$ ./fifo --debug-on-errors=true
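
For instance, a value containing shell metacharacters needs quoting (the parameter assignment below is purely illustrative):

$ ./fifo '--**.gen.sendIaTime=exponential(0.2s)'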

In case an option is specified both on the command line and in an ini file, the command line takes precedence.

To get the list of all possible configuration options, use the -h config option. (The additional -s option below just makes the output less verbose.)

$ opp_run -s -h config
Supported configuration options:
  **.bin-recording=<bool>, default:true; per-object setting
  check-signals=<bool>, default:true; per-run setting
  cmdenv-autoflush=<bool>, default:false; per-run setting
  cmdenv-config-name=<string>; global setting
  ...

To see the option descriptions as well, use -h configdetails.

$ opp_run -h configdetails

11.5 Specifying Ini Files

The default ini file is omnetpp.ini, and is loaded if no other ini file is given on the command line.

Ini files can be specified both as plain arguments and with the -f option, so the following two commands are equivalent:

$ ./fifo experiment.ini common.ini
$ ./fifo -f experiment.ini -f common.ini

Multiple ini files can be given, and their contents will be merged. This allows for partitioning the configuration into separate files, for example into files containing simulation options, module parameters, and result recording options, respectively.
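
A minimal sketch of such a split (the file names and contents are illustrative):

# common.ini
[General]
network = FifoNet
sim-time-limit = 100s

# experiment.ini
[Config Fifo1]
**.gen.sendIaTime = exponential(0.2s)
**.vector-recording = false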

11.6 Specifying the NED Path

NED files are loaded from directories listed on the NED path. More precisely, they are loaded from the listed directories and their whole subdirectory trees. Directories are separated with a semicolon (;).

The NED path can be specified in several ways: with the -n command-line option, with the NEDPATH environment variable, or with the ned-path configuration option.

NED path resolution rules are as follows:

  1. OMNeT++ checks for NED path specified on the command line with the -n option
  2. If not found on the command line, it checks for the NEDPATH environment variable
  3. The ned-path option value from the ini file is appended to the result of the above steps
  4. If the result is still empty, it falls back to "." (the current directory)
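
For example (the directory names are illustrative), the NED path can be given on the command line or via the environment variable:

$ ./aloha -n '.;../common/ned'
$ NEDPATH='.;../common/ned' ./aloha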

11.7 Selecting a User Interface

OMNeT++ simulations can be run under different user interfaces, a.k.a. runtime environments. Currently the following user interfaces are supported: Qtenv (a Qt-based graphical user interface), Tkenv (the older Tcl/Tk-based graphical user interface), and Cmdenv (a command-line user interface designed for batch execution).

You would typically test and debug your simulation under Tkenv or Qtenv, then run actual simulation experiments from the command line or from a shell script, using Cmdenv. Tkenv and Qtenv are also better suited for educational and demonstration purposes.

User interfaces are provided in the form of libraries that can be linked statically or dynamically with the simulation, or loaded at runtime.

When several user interface libraries are available in a simulation program, the user can select via command-line or ini file options which one to use. In the absence of such an option, the one with the highest priority will be started. Currently priorities are set such that Qtenv has the highest priority, then Tkenv, and finally Cmdenv. By default, simulations are linked with all available user interfaces, but this can be controlled via opp_makemake options or in the OMNeT++ global build configuration as well. The user interfaces available in a simulation program can be listed by running it with the -h userinterfaces option.

You can explicitly select a user interface on the command line with the -u option (specify Qtenv, Tkenv or Cmdenv as its argument), or by adding the user-interface option to the configuration. If both the config option and the command line option are present, the command line option takes precedence.

Since the graphical interfaces are the default (have higher priority), the most common use of the -u option is to select Cmdenv, e.g. for batch execution. The following example performs all runs of the Aloha example simulation using Cmdenv:

$ ./aloha -c PureAlohaExperiment -u Cmdenv

11.8 Selecting Configurations and Runs

All user interfaces support the -c <configname> and -r <runfilter> options for selecting which simulation(s) to run.

The -c option expects the name of an ini file configuration as an argument. The -r option may be needed when the configuration expands to multiple simulation runs. That is the case when the configuration defines a parameter study (see section [10.4]), or when it contains a repeat configuration option that prescribes multiple repetitions with different RNG seeds (see section [10.4.6]). The -r option can then be used to select a subset of all runs (or one specific run, for that matter). A missing -r option selects all runs in the given configuration.

It depends on the particular user interface how it interprets the -c and -r options. Cmdenv performs all selected simulation runs (optionally stopping after the first one that finishes with an error). GUI interfaces like Qtenv and Tkenv may use this information to fill the run selection dialog (or to set up the simulation automatically if there is only one matching run).

11.8.1 Run Filter Syntax

The run filter accepts two syntaxes: a comma-separated list of run numbers or run number ranges (for example 1,2,5-10), or an arithmetic expression. The arithmetic expression is similar to constraint expressions in the configuration (see section [10.4.5]). It may refer to iteration variables and to the repeat counter with the dollar syntax: $numHosts, $repetition. An example: $numHosts>10 && $mean==2.

Note that due to the presence of the dollar sign (and spaces), the expression should be protected against shell expansion, e.g. using apostrophes:

$ ./aloha -c PureAlohaExperiment -r '$numHosts>10 && $mean<2'

Or, with double quotes:

$ ./aloha -c PureAlohaExperiment -r "\$numHosts>10 && \$mean<2"

11.8.2 The Query Option

The -q (query) option complements -c and -r, and allows one to list the runs matched by the run filter. -q expects an argument that defines the format and verbosity of the output. Several formats are available: numruns, runnumbers, runs, rundetails, runconfig. Use opp_run -h to get a complete list.

-q runs prints one line of information with the iteration variables about each run that the run filter matches. An example:

$ ./aloha -s -c PureAlohaExperiment -r '$numHosts>10 && $mean<2' -q runs
Run 14: $numHosts=15, $mean=1, $repetition=0
Run 15: $numHosts=15, $mean=1, $repetition=1
Run 28: $numHosts=20, $mean=1, $repetition=0
Run 29: $numHosts=20, $mean=1, $repetition=1

The -s option just makes the output less verbose.

If you need more information, use -q rundetails or -q runconfig. rundetails is like runs, but in addition to the iteration variables it also prints a summary of the configuration (the expanded values of configuration entries that contain iteration variables) for each matching run:

$ ./aloha -s -c PureAlohaExperiment -r '$numHosts>10 && $mean<2' -q rundetails
Run 14: $numHosts=15, $mean=1, $repetition=0
    Aloha.numHosts = 15
    Aloha.host[*].iaTime = exponential(1s)

Run 15: $numHosts=15, $mean=1, $repetition=1
    Aloha.numHosts = 15
    Aloha.host[*].iaTime = exponential(1s)
...

The numruns and runnumbers formats are mainly intended for use in scripts. They just print the number of matching runs and the plain run number list, respectively.

$ ./aloha -s -c PureAlohaExperiment -r '$numHosts>10 && $mean<2' -q numruns
4
$ ./aloha -s -c PureAlohaExperiment -r '$numHosts>10 && $mean<2' -q runnumbers
 14 15 28 29

The -q option also encapsulates some unrelated functionality: -q sectioninheritance ignores -r, and prints the inheritance chain of the ini file sections (the inheritance graph after linearization) for the configuration denoted by -c.

11.9 Loading Extra Libraries

OMNeT++ allows you to load shared libraries at runtime. These shared libraries may contain model code (e.g. simple module implementation classes), dynamically registered classes that extend the simulator's functionality (for example NED functions, result filters/recorders, figure types, schedulers, output vector/scalar writers, Qtenv inspectors, or even custom user interfaces), or other code.

Libraries can be specified with the -l <libraryname> command line option (there can be several -l's on the command line), or with the load-libs configuration option. The values from the command line and the config file will be merged.

The platform-specific prefix and suffix of the library file name can be omitted (the extensions .dll, .so and .dylib, and also the common lib prefix on Unix-like systems). This means that you can specify the library name in a platform-independent way: if you specify -l foo, then OMNeT++ will look for foo.dll, libfoo.dll, libfoo.so or libfoo.dylib, depending on the platform.

OMNeT++ will use the dlopen() or LoadLibrary() system call to load the library. To ensure that the system call finds the file, either specify the library name with a full path (the prefix and suffix of the file name can still be omitted), or adjust the shared library path environment variable of your OS: PATH on Windows, LD_LIBRARY_PATH on Unix, and DYLD_LIBRARY_PATH on Mac OS X.
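
For example (the library and directory names are illustrative), a model library built in a sibling project could be run like this on Linux:

$ LD_LIBRARY_PATH=../mymodels/src:$LD_LIBRARY_PATH opp_run -l mymodels -n ../mymodels/src omnetpp.ini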

11.10 Stopping Condition

The most common way of specifying when to finish the simulation is to set a time limit. Simulation time and CPU time limits can be set with the sim-time-limit and cpu-time-limit configuration options, respectively.

An example:

$ ./fifo --sim-time-limit=500s

If several time limits are set together, the simulation will stop when the first one is hit.

If needed, the simulation may also be stopped programmatically, for example when results of a (steady-state) simulation have reached the desired accuracy. This can be done by calling the endSimulation() method.
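
A minimal sketch of programmatic stopping (the module and statistic names are hypothetical; cStdDev is the kernel's basic statistics class):

// a sketch, assuming the module has a member: cStdDev stat;
void MySink::handleMessage(cMessage *msg)
{
    stat.collect(simTime() - msg->getCreationTime());
    delete msg;

    // stop once enough samples were collected and the relative spread is small
    if (stat.getCount() >= 1000 && stat.getStddev() / stat.getMean() < 0.01)
        endSimulation();
}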

11.11 Controlling the Output

Configuration options such as record-eventlog, scalar-recording and vector-recording can be used to enable or disable the creation of various output files during simulation.

These configuration options, like any other, can be specified both in ini files and on the command line. An example:

$ ./fifo --record-eventlog=true --scalar-recording=false --vector-recording=false
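
The same settings in ini file form (note that scalar-recording and vector-recording are per-object options, so in an ini file they are normally written with a wildcard pattern):

[General]
record-eventlog = true
**.scalar-recording = false
**.vector-recording = false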

11.12 Debugging

Debugging is a task that comes up often during model development. Configuration options related to C++ debugging include debug-on-errors, used in the example below.

An example that launches the simulation under the gdb debugger:

$ gdb --args ./aloha --debug-on-errors=true
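
A minimal sketch of the resulting gdb session:

(gdb) run
...
(gdb) bt

With debug-on-errors enabled, a runtime error raises a debugger trap, so gdb stops at the point of the error; the bt (backtrace) command then shows the call stack that led there.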

11.13 Debugging Leaked Messages

The most common cause of memory leaks in OMNeT++ simulations is forgetting to delete messages. When this happens, the simulation process will continually grow in size as the simulation progresses, and when left to run long enough, it will eventually cause an out-of-memory condition.

Luckily, this problem is easy to identify, as all user interfaces display the number of message objects currently in the system. Take a look at the following example Cmdenv output:

...
** Event #1908736   t=58914.051870113485   Elapsed: 2.000s (0m 02s)
     Speed:     ev/sec=954368   simsec/sec=29457   ev/simsec=32.3987
     Messages:  created: 561611   present: 21   in FES: 34
** Event #3433472   t=106067.401570204991   Elapsed: 4.000s (0m 04s)
     Speed:     ev/sec=762368   simsec/sec=23576.7   ev/simsec=32.3357
     Messages:  created: 1010142   present: 354   in FES: 27
** Event #5338880   t=165025.763387178965   Elapsed: 6.000s (0m 06s)
     Speed:     ev/sec=952704   simsec/sec=29479.2   ev/simsec=32.3179
     Messages:  created: 1570675   present: 596   in FES: 21
** Event #6850304   t=211763.433233042017   Elapsed: 8.000s (0m 08s)
     Speed:     ev/sec=755712   simsec/sec=23368.8   ev/simsec=32.3385
     Messages:  created: 2015318   present: 732   in FES: 38
** Event #8753920   t=270587.781554343184   Elapsed: 10.000s (0m 10s)
     Speed:     ev/sec=951808   simsec/sec=29412.2   ev/simsec=32.361
     Messages:  created: 2575634   present: 937   in FES: 32
** Event #10270208   t=317495.244698246477   Elapsed: 12.000s (0m 12s)
     Speed:     ev/sec=758144   simsec/sec=23453.7   ev/simsec=32.3251
     Messages:  created: 3021646   present: 1213   in FES: 20
...

The interesting parts are the present: values. Their steady increase is an indication that the simulation model, i.e. one or more modules in it, is missing some delete msg calls. It is best to use Qtenv or Tkenv to narrow down the issue to specific modules and/or message types.

Qtenv and Tkenv are also able to display the number of messages currently in the simulation. The numbers are displayed on the status bar. If you find that the number of messages is steadily increasing, you need to find where the message objects are located. This can be done with the help of the Find/Inspect Objects dialog.

If the number of messages is stable, it is still possible that the simulation is leaking other cObject-based objects; they can also be found using the Find/Inspect Objects dialog.

If the simulation is leaking non-OMNeT++ objects (i.e. not something derived from cObject) or other memory blocks, Cmdenv, Tkenv or Qtenv cannot help in tracking down the issue.

11.14 Debugging Other Memory Problems

Technically, memory leaks are only a subset of problems associated with memory allocations, i.e. the usage of new and delete in C++.

There are specialized tools that can help in tracking down memory allocation problems (memory leaks, double deletion, referencing deleted blocks, etc.), Valgrind being probably the best-known one on Linux.

11.15 Profiling

When a simulation runs correctly but is too slow, you might want to profile it. Profiling basically means collecting runtime information about how much time is spent at various parts of the program, in order to find places where optimizing the code would have the most impact.

However, there are a few other options you can try before resorting to profiling and optimizing. First, verify that it is the simulation itself that is slow. Make sure that features like eventlog recording are not accidentally turned on. Run the simulation under Cmdenv to eliminate any possible overhead from Qtenv/Tkenv. If you must run the simulation under Qtenv/Tkenv, you can still gain speed by disabling animation features, closing all inspectors, hiding UI elements like the timeline, and so on.

Also, compile your code in release mode (with make MODE=release, see [9.2.3]) instead of debug. That can make a huge difference, especially with heavily templated code.

Well-known profiling tools include gprof, Valgrind's Callgrind tool, and the Linux perf utility.

11.16 Checkpointing

Debugging long-running simulations can be challenging, because one often needs to run the simulation for a long time just to get to the point of failure and be able to start debugging.

Checkpointing can facilitate debugging such errors. It is a technique that basically consists of saving a snapshot of the application's state, and being able to resume execution from there, even multiple times. OMNeT++ itself contains no checkpointing functionality, but it is available via external tools. It depends on the tool whether it is able to restore GUI windows (usually it cannot).

Checkpointing tools available on Linux include DMTCP and CRIU.

11.17 Using Cmdenv

Cmdenv is a lightweight, command line user interface that compiles and runs on all platforms. Cmdenv is designed primarily for batch execution.

Cmdenv simply executes some or all simulation runs that are described in the configuration file. The runs to be executed can be passed via command-line arguments or configuration options.

Cmdenv runs simulations in the same process. This means that e.g. if one simulation run writes a global variable, subsequent runs will also see the change. This is one reason why global variables in models are strongly discouraged.

11.17.1 Sample Output

When you run the Fifo example under Cmdenv, you should see something like this:

$ ./fifo -u Cmdenv -c Fifo1

OMNeT++ Discrete Event Simulation  (C) 1992-2017 Andras Varga, OpenSim Ltd.
Version: 5.0, edition: Academic Public License -- NOT FOR COMMERCIAL USE
See the license for distribution terms and warranty disclaimer
Setting up Cmdenv...
Loading NED files from .: 5

Preparing for running configuration Fifo1, run #0...
Scenario: $repetition=0
Assigned runID=Fifo1-0-20090104-12:23:25-5792
Setting up network 'FifoNet'...
Initializing...
Initializing module FifoNet, stage 0
Initializing module FifoNet.gen, stage 0
Initializing module FifoNet.fifo, stage 0
Initializing module FifoNet.sink, stage 0

Running simulation...
** Event #1   t=0   Elapsed: 0.000s (0m 00s)  0% completed
     Speed:     ev/sec=0   simsec/sec=0   ev/simsec=0
     Messages:  created: 2   present: 2   in FES: 1
** Event #232448   t=11719.051014922336   Elapsed: 2.003s (0m 02s)  3% completed
     Speed:     ev/sec=116050   simsec/sec=5850.75   ev/simsec=19.8351
     Messages:  created: 58114   present: 3   in FES: 2
...
** Event #7206882   t=360000.52066583684   Elapsed: 78.282s (1m 18s)  100% completed
     Speed:     ev/sec=118860   simsec/sec=5911.9   ev/simsec=20.1053
     Messages:  created: 1801723   present: 3   in FES: 2

<!> Simulation time limit reached -- simulation stopped.

Calling finish() at end of Run #0...
End.

As Cmdenv runs the simulation, it periodically prints the sequence number of the current event, the simulation time, the elapsed (real) time, and the performance of the simulation (how many events are processed per second; in the first status line these values are 0 because there was not yet enough data to calculate them). At the end of the simulation, the finish() methods of the simple modules are run, and their output is displayed.

11.17.2 Selecting Runs, Batch Operation

The most important command-line options for Cmdenv are -c and -r for selecting which simulations to perform. (They were described in section [11.8].) They also have their equivalent configuration options that can be written in files as well: cmdenv-config-name and cmdenv-runs-to-execute.

Another configuration option, cmdenv-stop-batch-on-error, controls Cmdenv's behavior when performing multiple runs: it determines whether Cmdenv should stop after the first run that finishes with an error. By default, it does.
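
For example, a sketch of the corresponding ini file entries (the configuration name and run numbers are illustrative):

[General]
cmdenv-config-name = PureAlohaExperiment
cmdenv-runs-to-execute = 0,2-4
cmdenv-stop-batch-on-error = false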

When performing multiple runs, Cmdenv prints run statistics at the end. Example output:

$ ./aloha -c PureAlohaExperiment -u Cmdenv
...
Run statistics: total 42, successful 30, errors 1, skipped 11

11.17.3 Express Mode

Cmdenv can execute simulations in two modes: Express mode, which prints only periodic status reports and is optimized for batch execution, and Normal mode, which prints detailed information (such as event banners) about the simulation's progress and is therefore mainly useful for testing and debugging.

The default mode is Express. To turn off Express mode, specify false for the cmdenv-express-mode configuration option:

$ ./fifo -u Cmdenv -c Fifo1 --cmdenv-express-mode=false

There are several other options that also affect Express-mode and Normal-mode behavior.

See Appendix [26] for more information about these options.
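
For example, Express mode with the detailed performance display (described below) can be requested in an ini file like this:

[General]
cmdenv-express-mode = true
cmdenv-performance-display = true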

11.17.3.1 Interpreting Express-Mode Output

When the simulation is running in Express mode with detailed performance display enabled (cmdenv-performance-display=true), Cmdenv periodically outputs a three-line status report about the progress of the simulation. The output looks like this:

...
** Event #250000   t=123.74354 ( 2m  3s)    Elapsed: 0m 12s
     Speed:     ev/sec=19731.6   simsec/sec=9.80713   ev/simsec=2011.97
     Messages:  created: 55532   present: 6553   in FES: 8
** Event #300000   t=148.55496 ( 2m 28s)    Elapsed: 0m 15s
     Speed:     ev/sec=19584.8   simsec/sec=9.64698   ev/simsec=2030.15
     Messages:  created: 66605   present: 7815   in FES: 7
...

The first line of the status display (beginning with **) contains the sequence number of the current event, the current simulation time, and the elapsed (real) time.

The second line displays simulation performance metrics: ev/sec is the number of events processed per second of real time, simsec/sec is the amount of simulation time advanced per second of real time, and ev/simsec is the number of events per simulated second, which is largely a property of the model rather than of the hardware.

The third line displays the number of messages created, currently present, and currently scheduled in the FES (future event set); it is important because it may indicate the "health" of your simulation.

The second value, the number of messages present, is more useful than perhaps one would initially think. It can be an indicator of the "health" of the simulation; if it is growing steadily, then either you have a memory leak and are losing messages (which indicates a programming error), or the network you simulate is overloaded and queues are steadily filling up (which might indicate wrong input parameters).

Of course, if the number of messages does not increase, it does not mean that you do not have a memory leak (other memory leaks are also possible). Nevertheless the value is still useful, because by far the most common way of leaking memory in a simulation is by not deleting messages.

11.17.4 Other Options

Cmdenv has more configuration options than mentioned in this section; see the options beginning with cmdenv- in Appendix [26] for the complete list.

11.18 The Qtenv Graphical User Interface

Qtenv is a runtime simulation GUI. Qtenv supports interactive simulation execution, animation, tracing and debugging. Qtenv is recommended in the development stage of a simulation and for presentation purposes, since it allows one to get a detailed picture of the state of simulation at any point of execution and to follow what happens inside the network. Note that 3D visualization support and smooth animation support are only available in Qtenv. As of version OMNeT++ 5.1, Qtenv is the default user interface, and Tkenv is in maintenance mode.

11.18.1 Command-Line and Configuration Options

Simulations run under Qtenv accept all general command-line and configuration options, including -c and -r. Configuration options specific to Qtenv have the qtenv- prefix; see Appendix [26] for the list of possible configuration options.

11.19 The Tkenv Graphical User Interface

Tkenv is the traditional Tcl/Tk-based graphical runtime user interface. Tkenv supports interactive simulation execution, tracing and debugging. Tkenv is recommended in the development stage of a simulation and for presentation purposes, since it allows one to get a detailed picture of the state of simulation at any point of execution and to follow what happens inside the network.

11.19.1 Command-Line and Configuration Options

Simulations run under Tkenv accept all general command-line and configuration options, including -c and -r. Configuration options specific to Tkenv have the tkenv- prefix; see Appendix [26] for the list of possible configuration options.

11.20 Running Simulation Campaigns

Once your model works reliably, you will usually want to run several simulations, either to explore the parameter space via a parameter study (see section [10.4]), or to do multiple repetitions with different RNG seeds to increase the statistical accuracy of the results (see section [10.4.6]).

In this section, we will explore several ways to run batches of simulations efficiently.

11.20.1 The Naive Approach

Assume that you want to run the parameter study in the Aloha example simulation for the numHosts>15 cases.

The first idea is that Cmdenv is capable of running simulation batches. The following command will do the job:

$ ./aloha -u Cmdenv -c PureAlohaExperiment -r '$numHosts>15'
...
Run statistics: total 14, successful 14
End.

It works fine. However, this approach has some drawbacks, which become apparent when running hundreds or thousands of simulation runs.

  1. It uses only one CPU. In the age of multi-core CPUs, this is not very efficient.
  2. It is more prone to C++ programming errors in the model. A failure in a single run may abort execution (segfault) or corrupt the process state, possibly invalidating the results of subsequent runs.

To address the second drawback, we can execute each simulation run in its own Cmdenv instance.

$ ./aloha -c PureAlohaExperiment -r '$numHosts>15' -s -q runnumbers
28 29 30 31 32 33 34 35 36 37 38 39 40 41
$ ./aloha -u Cmdenv -c PureAlohaExperiment -r 28
$ ./aloha -u Cmdenv -c PureAlohaExperiment -r 29
$ ./aloha -u Cmdenv -c PureAlohaExperiment -r 30
...
$ ./aloha -u Cmdenv -c PureAlohaExperiment -r 41

It's a lot of commands to issue manually, but luckily it can be automated with a shell script like this:

#! /bin/sh
RUNS=$(./aloha -c PureAlohaExperiment -r '$numHosts>15' -s -q runnumbers)
for i in $RUNS; do
    ./aloha -u Cmdenv -c PureAlohaExperiment -r $i
done

Save the above into a text file called e.g. runAloha. Then give it executable permission, and run it:

$ chmod +x runAloha
$ ./runAloha

It will execute the simulations one-by-one, each in its own Cmdenv instance.

This approach involves a process start overhead for each simulation. Normally, this overhead is small compared to the time spent simulating. However, it may become more of a problem when running a large number of very short simulations (<<1s in CPU time). This effect may be mitigated by letting Cmdenv do several (e.g. 10) simulations in one go.
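
Since the -r option also accepts a comma-separated run number list, one Cmdenv process can perform several runs back to back, for example:

$ ./aloha -u Cmdenv -c PureAlohaExperiment -r 28,29,30,31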

And then, the script still uses only one CPU. It would be better to keep all CPUs busy. For example, if you have 8 CPUs, there should be eight processes running all the time -- when one terminates, another would be launched in its place. You might notice that this behavior is similar to what GNU Make's -j<numJobs> option does. The opp_runall utility, to be covered in the next section, exploits GNU Make to schedule the running of simulations on multiple CPUs.

11.20.2 Using opp_runall

OMNeT++ has a utility program called opp_runall, which allows you to execute simulations using multiple CPUs and multiple processes.

opp_runall groups simulation runs into batches. Every batch corresponds to a Cmdenv process, that is, runs of a batch execute sequentially inside the same Cmdenv process. Batches (i.e. Cmdenv instances) are scheduled for running so that they keep all CPUs busy. The batch size as well as the number of CPUs to use have sensible defaults but can be overridden.

11.20.2.1 Command Line

opp_runall expects the normal simulation command in its argument list. The first positional (non-option) argument and all following arguments are treated as the simulation command (simulation program and its arguments).

Thus, to modify a normal Cmdenv simulation command to make use of multiple CPUs, simply prefix it with opp_runall:

$ opp_runall ./aloha -u Cmdenv -c PureAlohaExperiment -r '$numHosts>15'

Options intended for opp_runall should come before the simulation command. These options include -b<N> for specifying the batch size, and -j<N> to specify the number of CPUs to use.

$ opp_runall -j8 -b4 ./aloha -u Cmdenv -c PureAlohaExperiment -r '$numHosts>15'

11.20.2.2 How It Works

First, opp_runall invokes the simulation command with extra command arguments (-s -q runnumbers) to figure out the list of runs it needs to perform, and groups the run numbers into batches. Then it exploits GNU make and its -j<N> option to do the heavy lifting. Namely, it generates a temporary makefile that allows make to run batches in parallel, and invokes make with the appropriate -j option. It is also possible to export the makefile for inspection and/or running it manually.

To illustrate the above, here is the content of such a makefile:

#
# This makefile was generated with the following command:
# opp_runall -j2 -b4 -e tmp ./aloha -u Cmdenv -c PureAlohaExperiment -r $numHosts>15 
#

SIMULATIONCMD = ./aloha -u Cmdenv -c PureAlohaExperiment -s \
                --cmdenv-redirect-output=true
TARGETS =  batch0 batch1 batch2 batch3

.PHONY: $(TARGETS)

all: $(TARGETS)
    @echo All runs completed.

batch0:
    $(SIMULATIONCMD) -r 28,29,30,31

batch1:
    $(SIMULATIONCMD) -r 32,33,34,35

batch2:
    $(SIMULATIONCMD) -r 36,37,38,39

batch3:
    $(SIMULATIONCMD) -r 40,41

11.20.3 Exploiting Clusters

With large scale simulations, using one's own desktop computer might not be enough. The solution could be to run the simulation on remote machines, that is, to employ a computing cluster.

In simple setups, cross-mounting the file system that contains OMNeT++ and the model, and using ssh to run the simulations might already provide a good solution.

In other cases, submitting simulation jobs and harvesting the results might be done via batch-queuing, cluster computing or grid computing middleware, such as Slurm or HTCondor.

11.21 Akaroa Support: Multiple Replications in Parallel

11.21.1 Introduction

Typical simulations are Monte Carlo simulations: they use (pseudo-)random numbers to drive the simulation model. For the simulation to produce statistically reliable results, one has to carefully consider two questions: how long the initial transient period lasts, i.e. when meaningful data collection may begin, and how long the simulation has to be run for the results to reach the required statistical accuracy.

Neither question is trivial to answer. One might just suggest to wait “very long” or “long enough”. However, this is neither simple (how do you know what is “long enough”?) nor practical (even with today's high speed processors simulations of modest complexity can take hours, and one may not afford multiplying runtimes by, say, 10, “just to be safe.”) If you need further convincing, please read [Pawlikowsky02] and be horrified.

A possible solution is to look at the statistics while the simulation is running, and decide at runtime when enough data have been collected for the results to have reached the required accuracy. One possible criterion is the width of the confidence interval relative to the mean, at a given confidence level. However, it is not known in advance how many observations have to be collected to achieve this accuracy -- it must be determined at runtime.

11.21.2 What Is Akaroa

Akaroa [Akaroa99] addresses the above problem. According to its authors, Akaroa (Akaroa2) is a “fully automated simulation tool designed for running distributed stochastic simulations in MRIP scenario” in a cluster computing environment.

MRIP stands for Multiple Replications in Parallel. In MRIP, the computers of the cluster run independent replications of the whole simulation process (i.e. with the same parameters but different seeds for the RNGs (random number generators)), generating statistically equivalent streams of simulation output data. These data streams are fed to a global data analyser responsible for analysis of the final results and for stopping the simulation when the results reach a satisfactory accuracy.

The simulation processes run independently of one another and continuously send their observations to the central analyser and control process. This process combines the independent data streams, and calculates from these observations an overall estimate of the mean value of each parameter. Akaroa2 decides, based on a given confidence level and precision, whether it has enough observations or not. When it judges that it has enough observations, it halts the simulation.

If n processors are used, the required simulation execution time is usually about n times smaller than that of a one-processor simulation (the required number of observations is produced sooner). Thus, the simulation is sped up approximately in proportion to the number of processors used, and sometimes even more.

Akaroa was designed at the University of Canterbury in Christchurch, New Zealand and can be used free of charge for teaching and non-profit research activities.

11.21.3 Using Akaroa with OMNeT++

11.21.3.1 Starting Akaroa

Before the simulation can be run in parallel under Akaroa, you have to start up the system: launch akmaster in the background on one host, then start an akslave process in the background on each host that is to take part in the simulation.

Each akslave establishes a connection with the akmaster.

Then you use akrun to start a simulation. akrun waits for the simulation to complete, and writes a report of the results to the standard output. The basic usage of the akrun command is:

$ akrun -n num_hosts command [argument..]

where command is the name of the simulation you want to start. Parameters for Akaroa are read from the file named Akaroa in the working directory. Collected data from the processes are sent to the akmaster process, and when the required precision has been reached, akmaster tells the simulation processes to terminate. The results are written to the standard output.
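
A hypothetical invocation that runs the Aloha example under Cmdenv on four hosts might look like this:

$ akrun -n 4 ./aloha -u Cmdenv -c PureAlohaExperiment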

The above description is not detailed enough to help you set up and successfully use Akaroa -- for that you need to read the Akaroa manual.

11.21.3.2 Configuring OMNeT++ for Akaroa

First of all, you have to compile OMNeT++ with Akaroa support enabled.

The OMNeT++ simulation must be configured in omnetpp.ini so that it passes the observations to Akaroa. The simulation model itself does not need to be changed -- it continues to write the observations into output vectors (cOutVector objects, see chapter [7]). You can place some of the output vectors under Akaroa control.

You need to add the following to omnetpp.ini:

[General]
rng-class = "cAkaroaRNG"
outputvectormanager-class = "cAkOutputVectorManager"

These lines cause the simulation to obtain random numbers from Akaroa, and allow data written to selected output vectors to be passed to Akaroa's global data analyser.

Akaroa's RNG is a Combined Multiple Recursive pseudorandom number generator (CMRG) with a period of approximately 2^191 random numbers, and provides a unique stream of random numbers for every simulation engine.

Then you need to specify which output vectors you want to be under Akaroa control (by default, none of them are). You can use the *, ** wildcards (see section [10.3.1]) to place certain vectors under Akaroa control.

<modulename>.<vectorname1>.with-akaroa = true
<modulename>.<vectorname2>.with-akaroa = true
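
For example (the module path and vector name are illustrative):

FifoNet.sink.lifetime.with-akaroa = true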

11.21.3.3 Using Shared File Systems

It is usually practical to have the same physical disk mounted (e.g. via NFS or Samba) on all computers in the cluster. However, because all OMNeT++ simulation processes run with the same settings, they would overwrite each other's output files. You can prevent this from happening by using the fname-append-host ini file entry:

[General]
fname-append-host = true

When turned on, it appends the host name to the names of the output files (output vector, output scalar, snapshot files).


