Appendix A. Specification of the 'Compute Scalars' operation

Table of Contents

Expressions
Computing Scalars

This appendix describes the Compute Scalars operation in detail. Some fields of the operation can contain expressions, so expressions are described first. The Compute Scalars operation itself is described in the following section.

Expressions

An expression is composed of constants, scalar values, vector or histogram fields, variables, operators, and functions.

Constants

Numeric constants are accepted in the standard format.

Examples: -1, 3.14159, 4.2e1

String constants must be enclosed in double quotes. Double quotes within the string must be preceded by a backslash character.

Examples: "xyz", "This string contains a \" character"

Scalars

The values of scalars in the input dataset can be accessed in two ways:

  • Simple names. Using the name of a scalar in the expression refers to all scalars with that name. If the name of the scalar contains non-alphanumeric characters, it must be enclosed in apostrophes.

  • Qualified names. Qualified names are in the form <module>.<scalar>. Here <module> is a pattern (see ...) that is matched against the full name of the module of the scalar; <scalar> is the name of the scalar (quoted if needed). As a special case, <module> can also be the full name of the module.

Note that scalar references may produce multiple values. Consequently, an expression that contains them may also produce multiple values by iterating over the values of the scalars. The iteration is defined differently for the two types of scalar references:

  • Scalars referenced by their simple names are restricted to come from the same module. For example, if we have four scalars recorded by two modules, m1.a, m1.b, m2.a, m2.b, then a+b produces two values: m1.a+m1.b and m2.a+m2.b.

  • Scalars referenced by their qualified names are iterated independently over their modules. This means that if there are several such patterns in the expression, then the computation is performed on their Cartesian product. With the input of the previous example, *.a+*.b produces four values: m1.a+m1.b, m1.a+m2.b, m2.a+m1.b, and m2.a+m2.b.

    The iteration can be restricted by binding some part of the module name to a variable, and using that variable in other patterns. The ${x=<pattern>} syntax in a module name pattern binds the part of the module name matched by <pattern> to a variable named x. Such variables can be referred to as ${x} in other patterns. The ${...} syntax accepts any expression in the pattern (like ${x+1}).

Examples: iaTime, 'sentPk:count', Aloha.server.duration, cli${i={0..2}}.pkSent, **.host[${i+2}].'end-to-end-delay'
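The two iteration rules can be illustrated with a short Python sketch. It is not the tool's implementation: the scalars dictionary, its numeric values, and the helper names are hypothetical, and * is simply taken to match every module of this toy input.

```python
from itertools import product

# Hypothetical input: scalar values keyed by (module, scalar name).
scalars = {
    ("m1", "a"): 1.0, ("m1", "b"): 10.0,
    ("m2", "a"): 2.0, ("m2", "b"): 20.0,
}

def modules_with(name):
    """Modules that recorded a scalar called `name`, in sorted order."""
    return sorted(m for (m, n) in scalars if n == name)

def simple_sum(x, y):
    """a+b with simple names: both operands come from the same module."""
    common = [m for m in modules_with(x) if m in modules_with(y)]
    return {m: scalars[m, x] + scalars[m, y] for m in common}

def qualified_sum(x, y):
    """*.a+*.b with qualified names: each reference iterates independently."""
    pairs = product(modules_with(x), modules_with(y))
    return {(mx, my): scalars[mx, x] + scalars[my, y] for mx, my in pairs}

simple = simple_sum("a", "b")        # one value per module
qualified = qualified_sum("a", "b")  # one value per module pair
print(simple)
print(qualified)
```

With this input, the simple-name form yields two values and the qualified form yields four, matching the example in the text.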

Note

Simple scalar references can be viewed as a shortcut for qualified references, where the module part is ${m=**} in the first reference, and ${m} in the subsequent references (here m is a fresh variable name). E.g. if a and b are scalars, then a*b is equivalent to ${m=**}.a * ${m}.b.

Vectors and Histograms

To refer to a field of a vector or histogram, use count(<name>), mean(<name>), min(<name>), max(<name>), stddev(<name>), and variance(<name>). Here <name> is the simple or qualified name of the vector or histogram.

The same rules apply to qualified names and iterations over modules, as in the case of scalars.

Patterns

Patterns can be used to specify the module name of an input statistic, or as the right-hand-side of the pattern matching operator (=~). Characters of the patterns are matched literally, except the following:

Pattern     Description
?           matches any character
*           matches zero or more characters except '.'
**          matches zero or more characters (any character)
{a-z}       matches a character in range a-z
{^a-z}      matches a character not in range a-z
{32..255}   any number (i.e. sequence of digits) in range 32..255 (e.g. "99")
[32..255]   any number in square brackets in range 32..255 (e.g. "[99]")
${x=**}     matches a pattern, and binds the matched substring to x
${x+1}      evaluates an expression, and matches the characters of the result
\           takes away the special meaning of the subsequent character
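As a rough illustration of the matching rules above, the following Python sketch translates a subset of the pattern syntax into a regular expression. It is an assumption-laden approximation, not the actual matcher: the name compile_pattern is hypothetical, the ${...} forms are not handled, and numbers are assumed to be maximal digit sequences.

```python
import re

_NUM = re.compile(r"([{\[])(\d+)\.\.(\d+)([}\]])")   # {lo..hi} or [lo..hi]
_CLS = re.compile(r"\{(\^?)(.)-(.)\}")                # {a-z} or {^a-z}

def compile_pattern(pattern):
    """Return a predicate that tests whether a string matches `pattern`."""
    parts, ranges, i = [], [], 0
    while i < len(pattern):
        num = _NUM.match(pattern, i)
        cls = _CLS.match(pattern, i)
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            parts.append(re.escape(pattern[i + 1]))   # escaped literal
            i += 2
        elif pattern.startswith("**", i):
            parts.append(".*")                        # any characters
            i += 2
        elif ch == "*":
            parts.append("[^.]*")                     # any characters except '.'
            i += 1
        elif ch == "?":
            parts.append(".")                         # any single character
            i += 1
        elif num and num.group(1) + num.group(4) in ("{}", "[]"):
            digits = r"(?<!\d)(\d+)(?!\d)"            # assumed: a whole number
            parts.append(r"\[" + digits + r"\]" if num.group(1) == "[" else digits)
            ranges.append((int(num.group(2)), int(num.group(3))))
            i = num.end()
        elif cls:
            parts.append("[%s%s-%s]" % (cls.group(1), re.escape(cls.group(2)),
                                        re.escape(cls.group(3))))
            i = cls.end()
        else:
            parts.append(re.escape(ch))               # ordinary literal
            i += 1
    regex = re.compile("".join(parts))

    def matches(text):
        m = regex.fullmatch(text)
        if m is None:
            return False
        # Only the numeric ranges are capture groups; check each against its bounds.
        return all(lo <= int(g) <= hi for g, (lo, hi) in zip(m.groups(), ranges))

    return matches

print(compile_pattern("*.a")("net.m1.a"))     # '*' cannot cross a '.'
print(compile_pattern("**.a")("net.m1.a"))    # '**' can
print(compile_pattern("host[32..255]")("host[99]"))
```

The numeric range is checked after the regular expression matches, because a range such as 32..255 is awkward to express as a character-level pattern.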

Variables

Variables can be defined by using the ${<var>=...} notation in patterns. The scope of a variable extends to the right of its definition. Some variables are predefined, and variables defined in one expression can be accessible in another.

Predefined variables

in Grouping expression

module, name, run, and the run attributes of the current statistic; value, if the current statistic is a scalar

in Value expression

group is the value of the grouping expression

in Target module

group is the value of the grouping expression

Operators

The following operators are interpreted as usual:

Arithmetic         + - * / ^ %
Bitwise            ~ | & # << >>
Concatenation      ++
Comparison         == != < > <= >=
Boolean            ! || &&
Conditional        ?:
Pattern matching   =~

Arithmetic, bitwise, concatenation, boolean, and comparison operators always evaluate their arguments; binary operators evaluate their left argument first. Arithmetic, bitwise, and comparison operators implicitly convert their arguments to numeric values, the concatenation operator converts them to strings, and the boolean operators convert them to boolean values. A runtime error occurs if the conversion fails.

The conditional operator (cond ? a : b) is special, because it does not evaluate all operands. If the condition is true, then the second, otherwise the third operand is evaluated.
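The difference between eager operators and the lazy conditional can be sketched in Python by passing operands as zero-argument functions. All names here (to_number, add, conditional, operand, trace) are hypothetical and only illustrate the evaluation order described above.

```python
trace = []  # records the order in which operands are evaluated

def operand(label, value):
    """Wrap a value so that evaluating it leaves a mark in `trace`."""
    def thunk():
        trace.append(label)
        return value
    return thunk

def to_number(value):
    """Implicit numeric conversion; raises ValueError if it fails."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    return float(value)

def add(left, right):
    """Eager binary operator: evaluates the left operand, then the right."""
    l = to_number(left())
    r = to_number(right())
    return l + r

def conditional(cond, then, otherwise):
    """cond ? a : b evaluates the condition and only the selected branch."""
    return then() if to_number(cond()) != 0 else otherwise()

total = add(operand("left", "4.2e1"), operand("right", True))
picked = conditional(operand("cond", True),
                     operand("then", 1),
                     operand("else", "n/a"))
print(total, picked, trace)
```

The "else" operand never appears in the trace, so a branch that would fail to convert causes no error as long as it is not selected.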

The pattern matching operator =~ expects a string expression as the left operand and a pattern as the right operand. If the pattern matches the string, then the result is the string, otherwise false.

Functions

The following functions compute an aggregated value from a set of values:

Function           Description
count(<expr>)      count of the values produced by <expr>
sum(<expr>)        sum of the values produced by <expr>
min(<expr>)        minimum of the values produced by <expr>
max(<expr>)        maximum of the values produced by <expr>
mean(<expr>)       mean of the values produced by <expr>
stddev(<expr>)     standard deviation of the values produced by <expr>
variance(<expr>)   variance of the values produced by <expr>
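A minimal Python sketch of these aggregates, applied to the values an expression produces, might look as follows. The function name aggregate and the sample values are hypothetical. The text does not say whether stddev and variance use the sample (n-1) or the population (n) formula; the sample formula is assumed here.

```python
import statistics

def aggregate(func, values):
    """Reduce the values produced by an expression to a single number."""
    table = {
        "count": len,
        "sum": sum,
        "min": min,
        "max": max,
        "mean": statistics.mean,
        "stddev": statistics.stdev,        # assumption: sample standard deviation
        "variance": statistics.variance,   # assumption: sample variance
    }
    return table[func](list(values))

# Hypothetical values, e.g. what *.a might produce over four modules.
values = [1.0, 2.0, 3.0, 4.0]
result = {name: aggregate(name, values)
          for name in ("count", "sum", "min", "max", "mean", "stddev", "variance")}
print(result)
```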

The following functions are not aggregating functions; if their arguments have multiple values, then applying the function also produces multiple values by iterating over them.

Function                                  Description

Math functions
sin(x), cos(x), tan(x), asin(x), acos(x), atan(x), atan2(x,y), exp(x), log(x), log10(x), sqrt(x), cbrt(x), hypot(x,y), sinh(x), cosh(x), tanh(x), ceil(x), floor(x), round(x), signum(x), min(x,y), max(x,y)
                                          These functions produce the same value as the similarly named methods of the java.lang.Math class in Java.
deg(x), rad(x)                            deg(x) converts radians to degrees; rad(x) converts degrees to radians.
fabs(x)                                   Absolute value of x.
rem(x,y)                                  IEEE floating point remainder of x and y.

String functions
length(s)                                 Returns the length of the string.
contains(s, substr)                       Returns true if string s contains substr as a substring.
substring(s, pos, len?)                   Returns the substring of s starting at the given position, either to the end of the string or at most len characters.
substringBefore(s, substr)                Returns the substring of s before the first occurrence of substr, or the empty string if s does not contain substr.
substringAfter(s, substr)                 Returns the substring of s after the first occurrence of substr, or the empty string if s does not contain substr.
substringBeforeLast(s, substr)            Returns the substring of s before the last occurrence of substr, or the empty string if s does not contain substr.
substringAfterLast(s, substr)             Returns the substring of s after the last occurrence of substr, or the empty string if s does not contain substr.
startsWith(s, substr)                     Returns true if s begins with the substring substr.
endsWith(s, substr)                       Returns true if s ends with the substring substr.
tail(s, len)                              Returns the last len characters of s, or the full s if it is shorter than len characters.
replace(s, substr, repl, startPos?)       Replaces all occurrences of substr in s with the string repl. If startPos is given, the search begins from position startPos in s.
replaceFirst(s, substr, repl, startPos?)  Replaces the first occurrence of substr in s with the string repl. If startPos is given, the search begins from position startPos in s.
trim(s)                                   Discards whitespace from the start and end of s, and returns the result.
indexOf(s, substr)                        Returns the position of the first occurrence of substring substr in s, or -1 if s does not contain substr.
choose(index, s)                          Interprets s as a space-separated list, and returns the item at the given index. Negative and out-of-bounds indices cause an error.
toUpper(s)                                Converts s to all uppercase, and returns the result.
toLower(s)                                Converts s to all lowercase, and returns the result.

Misc functions
select(index, ...)                        Returns the item at position index in the rest of the argument list; numbering starts from 0.
locate(x, ...)                            Returns the zero-based index of the first argument that is greater than or equal to x. If there is no such element, it returns the number of elements (index of last element + 1). Example: locate(42, 0,10,20,50,100) == 3
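The descriptions of a few of these functions can be restated as a Python sketch. The snake_case names are hypothetical stand-ins, and edge cases the table does not cover (such as an empty substr, or an out-of-range index for select) are left unspecified here as well.

```python
def locate(x, *items):
    """Index of the first item >= x, or the number of items if there is none."""
    for index, item in enumerate(items):
        if item >= x:
            return index
    return len(items)

def select(index, *items):
    """Item at the zero-based `index` in the rest of the argument list."""
    return items[index]

def choose(index, s):
    """Item at `index` of the space-separated list `s`; bad indices are errors."""
    items = s.split()
    if index < 0 or index >= len(items):
        raise IndexError("choose(): index %d out of bounds" % index)
    return items[index]

def substring_before(s, substr):
    """Part of `s` before the first `substr`, or "" if `substr` is absent."""
    pos = s.find(substr)
    return s[:pos] if pos >= 0 else ""

def substring_after_last(s, substr):
    """Part of `s` after the last `substr`, or "" if `substr` is absent."""
    pos = s.rfind(substr)
    return s[pos + len(substr):] if pos >= 0 else ""

def tail(s, length):
    """Last `length` characters of `s`, or all of `s` if it is shorter."""
    return s[max(len(s) - length, 0):]

print(locate(42, 0, 10, 20, 50, 100))
print(substring_before("Aloha.server", "."), substring_after_last("a.b.c", "."))
```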

Implicit conversions

The value of an expression can be a boolean, a double, an integer, or a string. During the evaluation, values are converted to the expected types of functions and operators automatically. The rules of these conversions are:

  • when a double is expected, then string values are parsed; boolean values are converted to 1 (true), or 0 (false)

  • when an integer is expected, then values are converted to double and are rounded

  • when a boolean is expected, then 0 is converted to false, anything else to true. String values are first converted to numeric, then to boolean.

  • when a string is expected, then numbers are converted to their decimal notation, booleans are converted to "0" or "1".
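The conversion rules above can be sketched as four small Python functions. The function names are hypothetical, and two details are assumptions because the text does not fix them: the rounding mode (round half up is used here) and the exact decimal formatting of numbers (Python's str() is used here). Python's float() also accepts some inputs, such as "inf", that the real parser may reject.

```python
import math

def to_double(value):
    """Strings are parsed; booleans become 1 (true) or 0 (false)."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    return float(value)  # raises ValueError if a string cannot be parsed

def to_integer(value):
    """Convert to double first, then round (assumed: round half up)."""
    return math.floor(to_double(value) + 0.5)

def to_boolean(value):
    """0 is false, anything else is true; strings go through numeric first."""
    if isinstance(value, bool):
        return value
    return to_double(value) != 0

def to_string(value):
    """Booleans become "0" or "1"; numbers use an assumed decimal notation."""
    if isinstance(value, bool):
        return "1" if value else "0"
    return str(value)

print(to_double("4.2e1"), to_integer("2.6"), to_boolean("0"), to_string(True))
```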

Parsing ambiguities

Expressions like a.b*c.d can be parsed either as a reference to the d statistic of modules whose names match the a.b*c pattern, or as the product of a.b and c.d. The expression parser always prefers the first meaning, i.e. * and ? characters are interpreted as part of the pattern. If you want to enter the product or a conditional expression, add spaces around the operators. Patterns cannot contain unquoted spaces, so the parse will be unambiguous.

There is another ambiguity that arises from the use of simple names. If you use e.g. s in an expression, it can refer to a scalar or to a variable. In this case the parser first tries to resolve the name as a variable reference, and if that fails, as a statistic name. However, quoted names (e.g. 's') always refer to statistics.
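The resolution order for simple names can be sketched as follows. The function resolve and the two dictionaries are hypothetical; they only mirror the rule that variables take precedence over statistics unless the name is quoted.

```python
def resolve(name, variables, statistics):
    """Resolve a simple name to a variable or a statistic."""
    quoted = len(name) >= 2 and name[0] == "'" and name[-1] == "'"
    if quoted:
        # Quoted names always refer to statistics.
        return ("statistic", statistics[name[1:-1]])
    if name in variables:
        # Unquoted names are tried as variables first.
        return ("variable", variables[name])
    return ("statistic", statistics[name])

# Hypothetical environment in which s is both a variable and a scalar.
variables = {"s": "bound-value"}
statistics = {"s": 3.0, "t": 4.0}

print(resolve("s", variables, statistics))
print(resolve("'s'", variables, statistics))
print(resolve("t", variables, statistics))
```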