
12 Result Recording and Analysis ¶

12.1 Result Recording ¶

OMNeT++ provides built-in support for recording simulation results, via output vectors and output scalars. Output vectors are time series data, recorded from simple modules or channels. You can use output vectors to record end-to-end delays or round trip times of packets, queue lengths, queueing times, module state, link utilization, packet drops, etc. -- anything that is useful to get a full picture of what happened in the model during the simulation run.

Output scalars are summary results, computed during the simulation and written out when the simulation completes. A scalar result may be an (integer or real) number, or may be a statistical summary comprised of several fields such as count, mean, standard deviation, sum, minimum, maximum, etc., and optionally histogram data.

Results may be collected and recorded in two ways:

Based on the signal mechanism, using declared statistics;
Directly from C++ code, using the simulation library.

The second method has been the traditional way of recording results. The first method, based on signals and declared statistics, was introduced in OMNeT++ 4.1, and it is preferable because it allows you to always record the results in the form you need, without requiring heavy instrumentation or continuous tweaking of the simulation model.

12.1.1 Using Signals and Declared Statistics ¶

This approach combines the signal mechanism (see [4.14]) and NED properties (see [3.12]) in order to de-couple the generation of results from their recording, thereby providing more flexibility in what to record and in which form. The solution is described in detail in section [4.15]; here we give only a short overview.

Statistics are declared in the NED files with the @statistic property, and modules emit values using the signal mechanism. The simulation framework records data by adding special result file writer listeners to the signals. By being able to choose what listeners to add, the user can control what to record in the result files and what computations to apply before recording. The aforementioned section [4.15] also explains how to instrument simple modules and channels for signals-based result recording.

The signals approach allows for calculation of aggregate statistics (such as the total number of packet drops in the network) and for implementing a warm-up period without support from module code. It also allows you to write dedicated statistics collection modules for the simulation, also without touching existing modules.

The same configuration options that were used to control result recording with cOutVector and recordScalar() also work when utilizing the signals approach, and there are extra configuration options to make the additional possibilities accessible.
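The decoupling described above can be pictured with a toy model. The following Python sketch is not OMNeT++ code and does not use its API; the classes Signal, CountRecorder and VectorRecorder and the helper after_warmup are hypothetical names made up for illustration. It only shows the idea that the emitting code knows nothing about what is recorded, and that a warm-up filter can be added from the outside.

```python
class Signal:
    """Toy stand-in for a signal: it only knows its listeners."""

    def __init__(self):
        self.listeners = []

    def subscribe(self, listener):
        self.listeners.append(listener)

    def emit(self, time, value):
        for listener in self.listeners:
            listener(time, value)


class CountRecorder:
    """Counts the values it receives (a scalar-like result)."""

    def __init__(self):
        self.count = 0

    def __call__(self, time, value):
        self.count += 1


class VectorRecorder:
    """Keeps every (time, value) pair (a vector-like result)."""

    def __init__(self):
        self.samples = []

    def __call__(self, time, value):
        self.samples.append((time, value))


def after_warmup(warmup, listener):
    """Wrap a listener so that values emitted before `warmup` are ignored."""
    def filtered(time, value):
        if time >= warmup:
            listener(time, value)
    return filtered


# The "module" only emits; which recorders exist is decided outside of it.
drop = Signal()
count = CountRecorder()
vector = VectorRecorder()
drop.subscribe(after_warmup(20.0, count))  # count obeys a 20s warm-up
drop.subscribe(vector)                     # vector records everything

for t in (5.0, 19.9, 20.0, 42.5):
    drop.emit(t, 1)

print(count.count, len(vector.samples))
```

Changing what gets recorded means changing only the subscribe lines, which is the role the configuration plays in the signals approach.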

12.1.2 Direct Result Recording ¶

With this approach, scalar and statistics results are collected in class variables inside modules, then recorded in the finalization phase via recordScalar() calls. Vectors are recorded using cOutVector objects. Use cStdDev to record summary statistics like mean, standard deviation, minimum/maximum, and histogram-like classes (cHistogram, cPSquare, cKSplit) to record the distribution. These classes are described in sections [7.8] and [7.9]. Recording of individual vectors, scalars and statistics can be enabled or disabled via the configuration (ini file), and it is also the place to set up recording intervals for vectors.

The drawback of recording results directly from modules is that result recording is hardcoded in modules, and even simple requirement changes (e.g. record the average delay instead of each delay value, or vice versa) require either a code change or an excessive amount of result collection code in the modules.

12.2 Configuring Result Collection ¶

12.2.1 Result File Names ¶

Simulation results are recorded into output scalar files (which also hold statistics results) and output vector files. The usual file extension for scalar files is .sca, and for vector files .vec.

Every simulation run generates a single scalar file and a vector file. The file names can be controlled with the output-vector-file and output-scalar-file options. These options rarely need to be used, because the default values are usually fine. The defaults are:

output-vector-file = "${resultdir}/${configname}-${runnumber}.vec"
output-scalar-file = "${resultdir}/${configname}-${runnumber}.sca"

Here, ${resultdir} is the value of the result-dir configuration option, which defaults to results/, and ${configname} and ${runnumber} are the name of the configuration in the ini file (e.g. [Config PureAloha]) and the run number, respectively. Thus, the above defaults generate file names like results/PureAloha-0.vec, results/PureAloha-1.vec, and so on.
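As a quick illustration of how the default templates expand, here is a minimal Python sketch. It uses Python's own string.Template, which happens to share the ${name} syntax; it is not how OMNeT++ performs the substitution, and the function name expand is hypothetical. The result directory is passed without a trailing slash so that the names match the ones listed above.

```python
from string import Template

DEFAULT_VECTOR_FILE = "${resultdir}/${configname}-${runnumber}.vec"
DEFAULT_SCALAR_FILE = "${resultdir}/${configname}-${runnumber}.sca"


def expand(template, configname, runnumber, resultdir="results"):
    """Fill in the three variables used by the default file name templates."""
    return Template(template).substitute(
        resultdir=resultdir, configname=configname, runnumber=runnumber
    )


vector_files = [expand(DEFAULT_VECTOR_FILE, "PureAloha", n) for n in range(2)]
scalar_file = expand(DEFAULT_SCALAR_FILE, "PureAloha", 0)
print(vector_files, scalar_file)
```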

NOTE

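In OMNeT++ 3.x, the default result file names were omnetpp.vec and omnetpp.sca, and scalar files were always appended to rather than overwritten. When needed, the old behavior for scalar files can be turned back on by setting output-scalar-file-append=true in the configuration.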

12.2.2 Enabling/Disabling Result Items ¶

The recording of simulation results can be enabled/disabled at multiple levels with various configuration options:

All recording from a @statistic can be enabled/disabled together using the statistic-recording option;
Recording of a scalar or a statistic object can be controlled with the scalar-recording option;
Recording of an output vector can be controlled with the vector-recording option;
Recording of the bins of a histogram object can be controlled with the bin-recording option.

All the above options are boolean per-object options, thus, they have similar syntaxes:

<module-path>.<statistic-name>.statistic-recording = true/false
<module-path>.<scalar-name>.scalar-recording = true/false
<module-path>.<vector-name>.vector-recording = true/false
<module-path>.<histogram-name>.bin-recording = true/false

For example, all recording from the following statistic

@statistic[queueLength](record=max,timeavg,vector);

can be disabled with this ini file line:

**.queueLength.statistic-recording = false

When a scalar, vector, or histogram is recorded using a @statistic, its name is derived from the statistic name, by appending the recording mode after a colon. For example, the above statistic will generate the scalars named queueLength:max and queueLength:timeavg, and the vector named queueLength:vector. Their recording can be individually disabled with the following lines:

**.queueLength:max.scalar-recording = false
**.queueLength:timeavg.scalar-recording = false
**.queueLength:vector.vector-recording = false

The statistic, scalar or vector name part in the key may also contain wildcards. This can be used, for example, to handle result items with similar names together, or, by using * as name, for filtering by module or to disable all recording. The following example turns off recording of all scalar results except those called latency, and those produced by modules named tcp:

**.tcp.*.scalar-recording = true
**.latency.scalar-recording = true
**.scalar-recording = false
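The ordering of the three lines matters. The sketch below is a simplified Python model of the lookup, under two assumptions that are not stated in this section: that the first matching line wins (which the ordering of the example suggests), and that `**` matches any character sequence while `*` matches any sequence without a dot. Index ranges and other pattern features are ignored, and the function names are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate a simplified key pattern into a regular expression.

    '**' matches anything, '*' matches anything except a dot,
    and every other character is taken literally.
    """
    parts = re.split(r"(\*\*|\*)", pattern)
    out = []
    for part in parts:
        if part == "**":
            out.append(".*")
        elif part == "*":
            out.append("[^.]*")
        else:
            out.append(re.escape(part))
    return re.compile("".join(out))


RULES = [
    ("**.tcp.*.scalar-recording", True),
    ("**.latency.scalar-recording", True),
    ("**.scalar-recording", False),
]


def scalar_recording_enabled(module_path, scalar_name, rules=RULES):
    """Return the value of the first rule whose pattern matches the key."""
    key = f"{module_path}.{scalar_name}.scalar-recording"
    for pattern, value in rules:
        if pattern_to_regex(pattern).fullmatch(key):
            return value
    return True  # assumed default when no line matches


print(scalar_recording_enabled("net.host[0].tcp", "rcvdBytes"))
print(scalar_recording_enabled("net.host[0].app", "latency"))
print(scalar_recording_enabled("net.host[0].app", "jitter"))
```

If the catch-all line came first in this model, it would match every key and the two more specific lines would never be consulted.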

To disable all result recording, use the following three lines:

**.statistic-recording = false
**.scalar-recording = false
**.vector-recording = false

The first line is not strictly necessary. However, it may improve runtime performance because it causes result recorders not to be added, instead of adding and then disabling them.

12.2.3 Selecting Recording Modes for Signal-Based Statistics ¶

Signal-based statistics recording has been designed so that it can be easily configured to record a “default minimal” set of results, a “detailed” set of results, and a custom set of results (by modifying the previous ones, or defined from scratch).

Recording can be tuned with the result-recording-modes per-object configuration option. The “object” here is the statistic, which is identified by the full path (hierarchical name) of the module or connection channel object in question, plus the name of the statistic (which is the “index” of @statistic property, i.e. the name in the square brackets). Thus, configuration keys have the syntax <module-path>.<statistic-name>.result-recording-modes=.

The result-recording-modes option accepts one or more items as value, separated by commas. An item may be a result recording mode, or one of two words with a special meaning, default and all:

A result recording mode means any item that may occur in the record key of the @statistic property; for example, count, sum, mean, vector((count-1)/2).
default stands for the set of non-optional items from the @statistic property's record list, that is, those without question marks.
all means all items from the @statistic property's record list, including the ones with question marks.

The default value is default.

A lone “-” as option value disables all recording modes.

Recording mode items in the list may be prefixed with “+” or “-” to add/remove them from the set of result recording modes. The initial set of result recording modes is default; if the first item is prefixed with “+” or “-”, then that and all subsequent items are understood as modifying the set; if the first item does not start with “+” or “-”, then it replaces the set, and further items are understood as modifying the set.

This sounds more complicated than it is; an example will make it clear. Suppose we are configuring the following statistic:

@statistic[foo](record=count,mean,max?,vector?);

With the following ini file lines (see results in comments):

**.result-recording-modes = default  # --> count, mean
**.result-recording-modes = all      # --> count, mean, max, vector
**.result-recording-modes = -        # --> none
**.result-recording-modes = mean     # --> only mean (disables 'default')
**.result-recording-modes = default,-vector,+histogram # --> count,mean,histogram
**.result-recording-modes = -vector,+histogram         # --> same as the previous
**.result-recording-modes = all,-vector,+histogram  # --> count,mean,max,histogram
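The rules above are compact enough to restate as code. The following Python sketch is a re-implementation of the description in this section, not the code OMNeT++ uses; the function names are hypothetical, and it assumes that items contain no commas (so a mode with a comma inside parentheses would not be handled).

```python
def parse_record_list(record):
    """Split a @statistic record list into (default, all) mode lists."""
    items = [item.strip() for item in record.split(",")]
    all_modes = [item.rstrip("?") for item in items]
    default_modes = [item for item in items if not item.endswith("?")]
    return default_modes, all_modes


def resolve_modes(record, option="default"):
    """Apply a result-recording-modes value to a record list."""
    default_modes, all_modes = parse_record_list(record)
    if option.strip() == "-":
        return []  # a lone "-" disables all recording modes

    def expand(word):
        # 'default' and 'all' stand for sets; anything else is a single mode
        return {"default": default_modes, "all": all_modes}.get(word, [word])

    modes = list(default_modes)  # the initial set is 'default'
    for index, item in enumerate(s.strip() for s in option.split(",")):
        if item.startswith("-"):
            for mode in expand(item[1:]):
                if mode in modes:
                    modes.remove(mode)
        else:
            if index == 0 and not item.startswith("+"):
                modes = []  # an unprefixed first item replaces the set
            for mode in expand(item.lstrip("+")):
                if mode not in modes:
                    modes.append(mode)
    return modes


RECORD = "count,mean,max?,vector?"
for value in ("default", "all", "-", "mean",
              "default,-vector,+histogram", "-vector,+histogram",
              "all,-vector,+histogram"):
    print(f"{value:30} --> {resolve_modes(RECORD, value)}")
```

Running it against the foo statistic reproduces the results given in the comments of the ini file lines above.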

Here is another example which shows how to write a more specific option key. The following line applies to queueLength statistics of fifo[] submodule vectors anywhere in the network:

**.fifo[*].queueLength.result-recording-modes = +vector  # default plus vector

In the result file, the recorded scalars will be suffixed with the recording mode, i.e. the mean of queueingTime will be recorded as queueingTime:mean.

12.2.4 Warm-up Period ¶

The warmup-period option specifies the length of the initial warm-up period. When set, results belonging to the first x seconds of the simulation will not be recorded into output vectors, and will not be counted into the calculation of output scalars. This option is useful for steady-state simulations. The default is 0s (no warmup period).

Example:

warmup-period = 20s

Results recorded via signal-based statistics automatically obey the warm-up period setting, but modules that compute and record scalar results manually (via recordScalar()) need to be modified so that they take the warm-up period into account.

NOTE

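When configuring a warm-up period, make sure that modules that compute and record scalar results manually via recordScalar() actually obey the warm-up period in the C++ code.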

The warm-up period is available via the getWarmupPeriod() method of the simulation manager object, so the C++ code that updates the corresponding state variables needs to be surrounded with an if statement:

Old:

dropCount++;

New:

if (simTime() >= getSimulation()->getWarmupPeriod())
    dropCount++;

12.2.5 Output Vectors Recording Intervals ¶

The size of output vector files can easily reach several gigabytes, but very often, only some of the recorded statistics are interesting to the analyst. In addition to selecting which vectors to record, OMNeT++ also allows one to specify one or more collection intervals.

The latter can be configured with the vector-recording-intervals per-object option. The syntax of the configuration option is <module-path>.<vector-name>.vector-recording-intervals=<intervals>, where both <module-path> and <vector-name> may contain wildcards (see [10.3.1]). <vector-name> is the vector name, or the name string of the cOutVector object. By default, all output vectors are turned on for the whole duration of the simulation.

One can specify one or more intervals in the <startTime>..<stopTime> syntax, separated by commas. <startTime> and <stopTime> need to be given with measurement units, and either can be omitted to denote the beginning or the end of the simulation, respectively.

The following example limits all vectors to three intervals, except dropCount vectors which will be recorded during the whole simulation run:

**.dropCount.vector-recording-intervals = 0..
**.vector-recording-intervals = 0..1000s, 5000s..6000s, 9000s..
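To make the interval syntax concrete, here is a small Python sketch that parses such a value and tests whether a given simulation time falls inside any interval. It is an illustration only: the unit table is a small subset chosen for the example, the function names are hypothetical, and treating both interval ends as inclusive is an assumption, since this section does not specify the boundary behavior.

```python
import math

UNITS = {"s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}  # subset, for illustration


def parse_time(text, default):
    """Convert e.g. '1000s' or '5ms' to seconds; empty text means `default`."""
    text = text.strip()
    if not text:
        return default
    # Try the longer suffixes first so that 'ms' is not mistaken for 's'.
    for unit in sorted(UNITS, key=len, reverse=True):
        if text.endswith(unit):
            return float(text[: -len(unit)]) * UNITS[unit]
    return float(text)  # bare number, like the "0" in "0..1000s"


def parse_intervals(spec):
    """Parse '<start>..<stop>, ...' into a list of (start, stop) pairs."""
    intervals = []
    for part in spec.split(","):
        start, _, stop = part.partition("..")
        intervals.append((parse_time(start, 0.0), parse_time(stop, math.inf)))
    return intervals


def is_recorded(t, intervals):
    """True if time t lies in any interval (ends assumed inclusive)."""
    return any(start <= t <= stop for start, stop in intervals)


intervals = parse_intervals("0..1000s, 5000s..6000s, 9000s..")
print(intervals)
print(is_recorded(500.0, intervals), is_recorded(3000.0, intervals))
```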

12.2.6 Recording Event Numbers in Output Vectors ¶

A third per-vector configuration option is vector-record-eventnumbers, which specifies whether to record event numbers for an output vector. (Simulation time and value are always recorded. Event numbers are needed by the Sequence Chart Tool, for example.) Event number recording is enabled by default; it may be turned off to save disk space.

**.vector-record-eventnumbers = false

If the (default) cIndexedFileOutputVectorManager class is used to record output vectors, there are two more options to fine-tune its resource usage. output-vectors-memory-limit specifies the total memory that can be used for buffering output vectors. Larger values produce less fragmented vector files (i.e. cause vector data to be grouped into larger chunks), and therefore allow more efficient processing later. vector-max-buffered-values specifies the maximum number of values to buffer per vector, before writing out a block into the output vector file. The default is no per-vector limit (i.e. only the total memory limit is in effect.)

12.2.7 Saving Parameters as Scalars ¶

When you are running several simulations with different parameter settings, you'll usually want to refer to selected input parameters in the result analysis as well -- for example when drawing a throughput (or response time) versus load (or network background traffic) plot. Average throughput or response time numbers are saved into the output scalar files, and it is useful for the input parameters to get saved into the same file as well.

For convenience, OMNeT++ automatically saves the iteration variables into the output scalar file if they have numeric value, so they can be referred to during result analysis.

WARNING

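If an iteration variable has a non-numeric value, it will not be recorded automatically and cannot be used during result analysis. This can happen unintentionally if you specify units inside the iteration variable list:

**.param = exponential( ${mean=0.2s, 0.4s, 0.6s} )  #WRONG!
**.param = exponential( ${mean=0.2, 0.4, 0.6}s )    #OK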

Module parameters can also be saved, but this has to be requested by the user, by configuring param-record-as-scalar=true for the parameters in question. The configuration key is a pattern that identifies the parameter, plus .param-record-as-scalar. An example:

**.host[*].networkLoad.param-record-as-scalar = true

This looks simple enough, however there are three pitfalls: non-numeric parameters, too many matching parameters, and random-valued volatile parameters.

First, the scalar file only holds numeric results, so non-numeric parameters cannot be recorded -- that will result in a runtime error.

Second, if wildcards in the pattern match too many parameters, that might unnecessarily increase the size of the scalar file. For example, if the host[] module vector size is 1000 in the example below, then the same value (3) will be saved 1000 times into the scalar file, once for each host.

**.host[*].startTime = 3
**.host[*].startTime.param-record-as-scalar = true  # saves "3" once for each host

Third, recording a random-valued volatile parameter will just save a random number from that distribution. This is rarely what you need, and the simulation kernel will also issue a warning if this happens.

**.interarrivalTime = exponential(1s)
**.interarrivalTime.param-record-as-scalar = true  # wrong: saves random values!

These pitfalls are quite common in practice, so it is usually better to rely on the iteration variables in the result analysis. That is, one can rewrite the above example as

**.interarrivalTime = exponential( ${mean=1}s )

and refer to the $mean iteration variable instead of the interarrivalTime module parameter(s) during result analysis. param-record-as-scalar=true is not needed, because iteration variables are automatically saved into the result files.

12.2.8 Recording Precision ¶

Output scalar and output vector files are text files, and floating point values (doubles) are recorded into them using fprintf()'s "%g" format. The number of significant digits can be configured using the output-scalar-precision and output-vector-precision configuration options.

The default precision is 12 digits. The following has to be considered when setting a different value:

IEEE-754 doubles are 64-bit numbers. The mantissa is 52 bits, which is roughly equivalent to 16 decimal places (52*log(2)/log(10)). However, due to rounding errors, usually only 12..14 digits are correct, and the rest is pretty much random garbage which should be ignored. In addition, when you convert the decimal representation back into a double for result processing, a small additional error will occur, because 0.1, 0.01, etc. cannot be accurately represented in binary. This conversion error is usually smaller than the error that the double variable already had before being recorded into the file. However, if it is important, you can eliminate this error by setting the recording precision to 16 digits or more (but again, be aware that the last digits are garbage).
The practical upper limit is 17 digits; setting it higher doesn't make any difference in fprintf()'s output.
Errors resulting from converting to/from decimal representation can be eliminated by choosing an output vector/output scalar manager class which stores doubles in their native binary form. The appropriate configuration options are outputvectormanager-class and outputscalarmanager-class. For example, cMySQLOutputVectorManager and cMySQLOutputScalarManager provided in samples/database fulfill this requirement.
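The effect of the precision setting on the decimal round trip is easy to try out. The Python sketch below uses the "%g" conversion of Python's % operator, which follows the same conventions as fprintf(); the helper name record is hypothetical. It writes the same double with 12 and with 17 significant digits and checks whether reading the text back yields the identical value.

```python
def record(value, digits):
    """Format a double the way "%.<digits>g" would write it into a text file."""
    return "%.*g" % (digits, value)


x = 0.1 + 0.2  # not exactly 0.3 in binary floating point

for digits in (12, 17):
    text = record(x, digits)
    restored = float(text)
    print(digits, text, restored == x)
# prints:
# 12 0.3 False
# 17 0.30000000000000004 True
```

With 12 digits the text is shorter and the difference after the round trip is far below the accuracy most results have anyway; with 17 digits the round trip is exact.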

However, before worrying too much about rounding and conversion errors, consider the real accuracy of your results:

In real life, it is very difficult to measure quantities (weight, distance, even time) with more than a few digits of precision. What precision are your input data? For example, if you approximate inter-arrival time as exponential(0.153) when the mean is really 0.152601... and the distribution is not even exactly exponential, you are already starting out with a bigger error than rounding can cause.
The simulation model is itself an approximation of real life. How much error do the (known and unknown) simplifications cause in the results?

12.3 The OMNeT++ Result File Format ¶

By default, OMNeT++ saves simulation results into textual, line-oriented files. The advantage of a text-based, line-oriented format is that it is very accessible and easy to parse with a wide range of tools and languages, and still provides enough flexibility to be able to represent the data it needs to (in contrast to e.g. CSV). This section provides an overview of these file formats (output vector and output scalar files); the precise specification is available in the Appendix ([27]). By default, each file contains data from one run only.

Result files start with a header that contains several attributes of the simulation run: a reasonably globally unique run ID, the network NED type name, the experiment-measurement-replication labels, the values of iteration variables and the repetition counter, the date and time, the host name, the process id of the simulation, random number seeds, configuration options, and so on. These data can be useful during result processing, and increase the reproducibility of the results.

Vectors are recorded into a separate file for practical reasons: vector data usually consume several orders of magnitude more disk space than scalars.

12.3.1 Output Vector Files ¶

All output vectors from a simulation run are recorded into the same file. The following sections describe the format of the file, and how to process it.

An example file fragment (without header):

...
vector 1   net.host[12]  responseTime  TV
1  12.895  2355.66
1  14.126  4577.66664666
vector 2   net.router[9].ppp[0] queueLength  TV
2  16.960  2
1  23.086  2355.66666666
2  24.026  8
...

There are two types of lines: vector declaration lines (beginning with the word vector), and data lines. A vector declaration line introduces a new output vector, and its columns are: vector Id, module of creation, name of the cOutVector object, and the column specification of the data lines (TV in the example: simulation time, then value). Actual data recorded in this vector are on data lines which begin with the vector Id. Further columns on data lines are the simulation time and the recorded value.

Since OMNeT++ 4.0, vector data are recorded into the file clustered by output vectors, which, combined with index files, allows much more efficient processing. Using the index file, tools can extract particular vectors by reading only those parts of the file where the desired data are located, and do not need to scan through the whole file linearly.
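Because the format is line-oriented, a few lines of code are enough to read a fragment like the one above. The following Python sketch is a minimal reader, not a full implementation of the file format: it assumes that every data line has the layout vector Id, time, value as in the example, skips everything else (header, attributes, other line types), and does not use the index file. The function name parse_vectors is hypothetical.

```python
import shlex
from collections import defaultdict

FRAGMENT = """\
vector 1   net.host[12]  responseTime  TV
1  12.895  2355.66
1  14.126  4577.66664666
vector 2   net.router[9].ppp[0] queueLength  TV
2  16.960  2
1  23.086  2355.66666666
2  24.026  8
"""


def parse_vectors(lines):
    """Return ({id: (module, name)}, {id: [(time, value), ...]})."""
    declarations = {}
    data = defaultdict(list)
    for line in lines:
        fields = shlex.split(line)  # also handles quoted names with spaces
        if not fields:
            continue
        if fields[0] == "vector":
            declarations[int(fields[1])] = (fields[2], fields[3])
        elif fields[0].isdigit():
            # data line: <vector id> <time> <value> (assumed layout)
            data[int(fields[0])].append((float(fields[1]), float(fields[2])))
    return declarations, dict(data)


declarations, data = parse_vectors(FRAGMENT.splitlines())
for vector_id, (module, name) in declarations.items():
    print(module, name, data[vector_id])
```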

12.3.2 Scalar Result Files ¶

Fragment of an output scalar file (without header):

...
scalar "lan.hostA.mac" "frames sent"  99
scalar "lan.hostA.mac" "frames rcvd"  3088
scalar "lan.hostA.mac" "bytes sent"   64869
scalar "lan.hostA.mac" "bytes rcvd"   3529448
...

Every scalar generates one scalar line in the file.

Statistics objects (cStatistic subclasses such as cStdDev) generate several lines: mean, standard deviation, etc.
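Scalar lines can be read in the same spirit. This Python sketch handles only lines of the form shown in the fragment above (the word scalar, a quoted module path, a quoted name, and a value) and ignores everything else, including the lines generated by statistics objects; parse_scalars is a hypothetical name.

```python
import shlex

FRAGMENT = '''\
scalar "lan.hostA.mac" "frames sent"  99
scalar "lan.hostA.mac" "frames rcvd"  3088
scalar "lan.hostA.mac" "bytes sent"   64869
scalar "lan.hostA.mac" "bytes rcvd"   3529448
'''


def parse_scalars(lines):
    """Return {(module, name): value} for every scalar line."""
    results = {}
    for line in lines:
        fields = shlex.split(line)  # strips the quotes, keeps embedded spaces
        if len(fields) == 4 and fields[0] == "scalar":
            results[(fields[1], fields[2])] = float(fields[3])
    return results


scalars = parse_scalars(FRAGMENT.splitlines())
print(scalars[("lan.hostA.mac", "frames sent")])
```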

12.4 SQLite Result Files ¶

Starting from version 5.1, OMNeT++ contains experimental support for saving simulation results into SQLite database files. The perceived advantages of SQLite are that it is already supported by many tools and languages (no need to write custom parsers), and that the power of the SQL language can be used for queries. The latter is very useful for processing scalar results, and less so for vectors and histograms.

To let a simulation record its results in SQLite format, add the following configuration options to its omnetpp.ini:

outputvectormanager-class="omnetpp::envir::SqliteOutputVectorManager"
outputscalarmanager-class="omnetpp::envir::SqliteOutputScalarManager"

NOTE

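Alternatively, you can make SQLite the default format by recompiling OMNeT++ with PREFER_SQLITE_RESULT_FILES=yes set in configure.user. (Do not forget to run ./configure before make.)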

The SQLite result files will be created with the same names as textual result files. The two formats also store exactly the same data, only in a different way (there is a one-to-one correspondence between them). The Simulation IDE and scavetool understand both formats.

HINT

The database schema can be found in Appendix [27].
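Without the schema at hand, a result database can still be explored generically. The Python sketch below uses only the standard sqlite3 module and SQLite's own catalog (sqlite_master and PRAGMA table_info) to list the tables and their columns; it makes no assumption about the OMNeT++ schema. The in-memory database and its example table are stand-ins for demonstration, and describe_database is a hypothetical name.

```python
import sqlite3


def describe_database(connection):
    """Return {table name: [column names]} for every table in the database."""
    tables = [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {
        table: [column[1] for column in
                connection.execute(f'PRAGMA table_info("{table}")')]
        for table in tables
    }


# Stand-in database; the table below is NOT part of the real schema.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE example (id INTEGER, label TEXT)")
print(describe_database(demo))

# For a real result file, one would open it by name instead, for example:
# connection = sqlite3.connect("results/PureAloha-0.sca")
```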

12.5 Scavetool ¶

OMNeT++'s scavetool is a command-line program for exploring, filtering and processing of result files, and exporting the result in formats digestible by other tools.

12.5.1 Commands ¶

scavetool's functionality is grouped under five commands: query, vector, scalar, index, and help.

query: Query the contents of result files. One can list runs, run attributes, result items, unique result names, unique module names, unique configuration names, etc. One can filter for result types (scalar/vector/histogram), and by run, module name, result name and value, using match expressions. There are various options controlling the format of the output (group-by-runs; grep-friendly; suppress labels; several modes for identifying the run in the output, etc.)
vector: Filter and export vector data. One can filter by run, module name and vector name, using match expressions. It is also possible to apply processing steps, e.g. a sliding window average or batch average filter. The list of available processing filters can be found in the help of the tool. Several output formats are available: vector file (default), CSV file, GNU Octave text file, MATLAB script.
scalar: Filter and export scalar data. One can filter by run, module name, scalar name and value, using match expressions. In the output, scalars are organized in a table. The table can be organized to contain Module, Name and Value columns, or to have each scalar in its own column (and data from one run in each row). Other groupings are also possible. Several output formats are available: CSV file, GNU Octave text file, MATLAB script.
index: Generate index files (.vci) for vector files. Note that this command is usually not needed, as other scavetool commands automatically create vector file indices if they are missing or out of date (unless indexing is explicitly disabled.) This command can also be used to rebuild a vector file so that data are clustered by vectors for more efficient access.
help: Prints help. The synopsis is scavetool help <topic>, where any command name can be used as topic, plus there are additional ones like patterns or filters. scavetool <command> -h also works.

The default command is query, so its name may be omitted on the command line.

12.5.2 Examples ¶

The following example prints a one-line summary about the contents of result files in the current directory:

$ scavetool *.sca *.vec
runs: 459   scalars: 4120   vectors: 7235   histograms: 916

Listing all results is possible with -l:

$ scavetool -l *.sca *.vec
PureAlohaExperiment-439-20161216-18:56:20-27607:

scalar Aloha.server  duration              26.3156
scalar Aloha.server  collisionLength:mean  0.139814
vector Aloha.host[0] radioState:vector vectorId=2 count=3 mean=0.33 ..
vector Aloha.host[1] radioState:vector vectorId=3 count=9 mean=0.44 ..
vector Aloha.host[2] radioState:vector vectorId=4 count=5 mean=0.44 ..
...

To export all scalars in CSV, use the following command:

$ scavetool scalar -g module,name -F csv -O x.csv *.sca

The following example writes the window-averaged queuing times stored in in.vec into out.vec:

$ scavetool vector -p "queuing time" -a 'winavg(10)' -O out.vec in.vec

The next example writes the queueing and transmission times of sink modules into CSV files. It generates a separate file for each vector, named out-1.csv, out-2.csv, etc.

$ scavetool vector -p 'module(**.sink) AND ("queueing time" OR "tx time")' \
    -O out.csv -F csv in.vec

12.6 Result Analysis ¶

We recommend the following routes for the analysis of simulation results:

Use the Simulation IDE for casual analysis, i.e. browsing data and quick plotting.
Use Python (or R) programs for sophisticated analysis and for producing tailored reports.

Of course, many other approaches are also possible, some of which are described in later sections. In all cases, scavetool can be used to filter and export simulation results in a format understood by other tools, for example CSV.

12.6.1 The Analysis Tool in the Simulation IDE ¶

The Simulation IDE provides an Analysis Tool for the analysis and visualization of simulation results. The Analysis Tool lets you load several result files at once, and presents their contents somewhat like a database. You can browse the results, select the particular data you are interested in (scalars, vectors, histograms), apply processing steps, and display them in charts or plots. Data selection, processing and charting steps can be freely combined, resulting in a high degree of freedom. These steps are grouped into and stored as "recipes", which get automatically re-applied when new result files are added or existing files are replaced. This automation spares the user lots of repetitive manual work, without resorting to scripting.

The Analysis Tool is covered in detail in the User Guide.

12.6.2 Spreadsheets ¶

Spreadsheets such as Excel or LibreOffice's Calc, with their charting and statistical features, offer an obvious alternative to the IDE's Analysis Tool. Spreadsheets are primarily good for analyzing scalar results, and less so for vectors and histograms. A commonly available feature called pivot table can be especially useful.

Spreadsheets cannot open result files directly. An obvious solution is to use scavetool (see [12.5]) to export data in CSV (comma-separated values) format, and open that in the spreadsheet program. scavetool can also help you merge data from several result files into a single CSV, and filter the results to reduce the amount of data to be loaded into the spreadsheet.

One drawback of using spreadsheet programs is the manual work associated with preparing and reloading data every time simulations are re-run.

12.6.3 Using Python for Result Analysis ¶

Python is a popular general-purpose programming language with a large ecosystem. It is widely used in the open source community and also commercially in a great number of application areas. Python is also among the most popular languages for teaching introductory computer science courses at universities.

In recent years, Python has emerged as a great platform for numerical computing and statistics due to the appearance of powerful extension packages like NumPy and Pandas. Some of the Python packages useful for our purposes:

NumPy and SciPy are numerical and scientific computing packages.
Pandas is a data analysis package for Python. The design of Pandas has been heavily influenced by R's data frames.
Matplotlib is a plotting library. Matplotlib provides a “pylab” API designed to closely resemble that of MATLAB, thereby making it easy to learn for experienced MATLAB users.
The sqlite3 package makes it possible to access SQLite3 databases from Python. Its significance is that it allows for working with OMNeT++ SQLite result files (see [12.4]).
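Even before reaching for these packages, the standard library is enough for a first pass over exported scalars. The following sketch averages a scalar over several runs. The CSV layout (columns run, module, name, value) and the numbers in it are made up for the example and are not the documented output of scavetool; the column names would have to be adapted to the actual export. The function name summarize is hypothetical.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical export with made-up values: one scalar per row.
CSV_TEXT = """run,module,name,value
run-0,Aloha.server,duration,26.0
run-1,Aloha.server,duration,28.0
run-2,Aloha.server,duration,30.0
"""


def summarize(csv_file):
    """Return {(module, name): (mean, sample stddev)} across all runs."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[(row["module"], row["name"])].append(float(row["value"]))
    return {
        key: (statistics.mean(values),
              statistics.stdev(values) if len(values) > 1 else 0.0)
        for key, values in groups.items()
    }


summary = summarize(io.StringIO(CSV_TEXT))
for (module, name), (mean, stddev) in summary.items():
    print(module, name, mean, stddev)
```

The same grouping is a one-liner in Pandas once the data are in a data frame; the point here is only that the exported files are plain enough to process with very little code.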

12.6.4 Using Other Software ¶

12.6.4.1 GNU R ¶

R is a free software environment for statistical computing and graphics. R is widely used for statistical software development and data analysis. The program uses a command line interface, though several graphical user interfaces are available. It has powerful plotting capabilities, and it is supported on all major operating systems and platforms.

HINT

An R package for processing OMNeT++ result files is available from https://github.com/omnetpp/omnetpp-resultfiles/wiki.

Some other OMNeT++-related packages such as SimProcTC and Syntony also build on R.

12.6.4.2 MATLAB, GNU Octave ¶

MATLAB is a commercial numerical computing environment and programming language. MATLAB allows easy matrix manipulation, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs in other languages.

GNU Octave is a MATLAB-like software environment, available on nearly all platforms. Octave is free software.

12.6.4.3 Gnuplot ¶

Gnuplot is a simple but popular program for creating two- and three-dimensional plots of functions and data. The program runs on all major platforms.

Gnuplot has an interactive command interface. For example, given two data files foo.csv and bar.csv that contain two values per line (x y; such files can be exported with scavetool from vector files), you can plot them in the same graph by typing:

plot "foo.csv" with lines, "bar.csv" with lines

Several commands are available to adjust the x/y ranges, plotting style, labels, scaling. The plot can be exported in various image formats.
