RiskScape expression language
RiskScape uses an expression language to allow models to be customized, for example for filtering or aggregating datasets. The expression language resembles spreadsheet formulas, or the expressions that appear inside SQL statements.
An expression can:
filter a dataset -
building.height_m > 10
compute new values -
round(hazard_intensity * 0.24 * road.replacement_cost)
apply a risk modelling function -
damage_ratio(building, tsunami)
be used to group values for aggregation -
region, risk
be used to apply an aggregation function -
sum(loss)
You can play with expressions using the riskscape expression
command, e.g.
riskscape expression eval 'ceil(1 + 0.7)'
will print 2 to the console.
Tip
This page describes the syntax and semantics of the RiskScape expression language. You may prefer to go through the How to write RiskScape expressions tutorial first, to learn about expressions through practical examples.
Language definition
Constants
The language supports declaring various simple constants, such as:
Integer - 456 - mapped internally to Java’s Long type - a 64-bit signed integer
Floating - 0.212321 - mapped internally to Java’s Double type - a 64-bit signed floating point number. Can be entered with scientific notation, e.g. 5.27e-10.
Text - 'this is some text' - an arbitrary length string of text, surrounded by single quotes. Single quotes can be inserted into a string by escaping them with a backslash, e.g. 'I don\'t like quotes'
Identifiers
Identifiers are a special kind of string that are used to identify
various objects in an expression. An identifier is any unquoted word that begins with a
letter and contains only letters, numbers and underscores (_
).
Identifiers with other characters, or ones that match keywords (such as and
, or
or as
)
are valid, but must be quoted with double quotes, e.g.
"My interesting thing (from space)"
and "foo:bar"
are valid identifiers.
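The quoting rule above can be sketched in a few lines of Python. This is an illustration only, not RiskScape's own parser: the keyword set contains just the three keywords named above (the real list may be longer), "letters" is assumed to mean ASCII letters, and the helper names are hypothetical. Escaping of double quotes inside a quoted identifier is not covered here.

```python
import re

# Assumption: only the keywords named in the text; the real list may be longer.
KEYWORDS = {"and", "or", "as"}

# An unquoted identifier begins with a letter and contains only
# letters, numbers and underscores (ASCII letters assumed here).
UNQUOTED = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def needs_quoting(name):
    """Return True if the identifier must be wrapped in double quotes."""
    return name in KEYWORDS or UNQUOTED.fullmatch(name) is None


def as_identifier(name):
    """Render a name the way it would be written in an expression."""
    return '"' + name + '"' if needs_quoting(name) else name
```

For example, as_identifier('cost') leaves the name untouched, while as_identifier('foo:bar') and as_identifier('and') both wrap it in double quotes.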
Depending on how and where identifiers are used, they can:
Identify an attribute on the input data, e.g. asset.cost - asset and cost are both identifiers. In this example the dot operator . is used to access the nested attribute cost that belongs to asset
Identify a function - my_function()
Identify a named argument within a function - calc_risk(x: 12, mean: 52) where x and mean are identifiers
Lists
A fixed length list can be declared to create an ordered list of values. Some functions
use lists as input, or you may want to produce a list in your output. A list is declared
in an expression like so - [1, 1, 2, 3, 5, 8, 13]
. Elements are surrounded by square brackets
and separated by commas. Whitespace between the elements is not necessary,
but improves readability.
A list can be filled with anything you like, but the type of the list will depend on what you fill it with. The type of thing inside the list is referred to as the contained type.
[1.0, 1.1, 2.1] has type List(Floating)
[1, 2.0, 'hello', cost, my_function(1, impact)] has type List(Anything)
NB: If any or all elements of the list are nullable, then the contained type will be nullable.
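As a rough illustration of how a contained type could be worked out, here is a small Python sketch. It only encodes what the two examples and the note above state: one distinct element type keeps that type, a mixture becomes Anything, and a nullable element makes the contained type nullable. The function name, the (type name, nullable) representation and the Nullable(...) notation are assumptions made for this sketch, and it deliberately ignores any numeric promotion RiskScape may apply to a list of only Integer and Floating values.

```python
def list_type(element_types):
    """Work out a list type from (type_name, nullable) pairs, one per element.

    Hypothetical helper: not part of RiskScape.
    """
    names = {name for name, _ in element_types}
    # One distinct element type keeps that type; anything else is Anything.
    contained = names.pop() if len(names) == 1 else "Anything"
    # If any element is nullable, the contained type is nullable too.
    if any(nullable for _, nullable in element_types):
        contained = "Nullable(" + contained + ")"
    return "List(" + contained + ")"
```

Calling list_type with three non-nullable Floating entries gives 'List(Floating)', matching the first example above.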
Tuples
A tuple in RiskScape is an ordered, named, typed list of values. You can think of a tuple as a row
in a database table. Members of tuples in RiskScape are accessed using identifiers, e.g. cost
.
It is worth noting that all expressions are evaluated against a tuple, with the tuple being a particular row
in the dataset that is being evaluated at a particular point. For example:
When setting up a bookmark, map-attribute expressions are evaluated against “raw” rows of the dataset being bookmarked
When filtering rows, the filter expression is evaluated against whatever data is in the pipeline at that point, e.g.
(asset.cost_dollars > 100000) and (loss.total_loss > 0)
As well as being evaluated by an expression, a tuple can be declared by an expression. Tuple expressions can use either keyword syntax or as syntax, e.g.
{height: asset.height, count: 0, cost_dollars: asset.cost_cents / 100}
{asset.height as height, 0 as count, asset.cost_cents / 100 as cost_dollars}
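To make the idea of "declaring a tuple from a row" concrete, the following Python sketch mimics the tuple expression above using nested dictionaries. The input values are invented for illustration and the function name is hypothetical. Note that Python's / always produces a float, which may differ from how RiskScape types the division.

```python
# Hypothetical input row; the attribute values are made up for this example.
row = {"asset": {"height": 12.5, "cost_cents": 250000}}


def declare_tuple(row):
    """Mimic {height: asset.height, count: 0, cost_dollars: asset.cost_cents / 100}."""
    asset = row["asset"]
    return {
        "height": asset["height"],
        "count": 0,
        "cost_dollars": asset["cost_cents"] / 100,
    }


result = declare_tuple(row)
```

The resulting dictionary keeps its members in declaration order (height, count, cost_dollars), in the same way that a tuple is an ordered, named list of values.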
Functions
As you have seen in the previous examples, expressions can call RiskScape functions by referring to their ID within your project. RiskScape comes by default with some common functions for dealing with numbers, text and geometry.
You can query the built-in functions available using the riskscape function list
command.
To see all functions, use riskscape function list --all
.
To view a particular category of function, such as all maths
functions, use riskscape function list --category maths
.
For more help, see riskscape function list --help
.
An expression can call a function by giving its ID as an identifier, followed by a bracketed list of arguments, e.g. a function that:
Takes some arguments -
min(1, 2)
Takes no arguments -
rand()
Takes named arguments -
norm_cdf(mean: 1.2, stddev: 0.2, x: hazard.intensity)
Takes the result of another function as an argument -
min(1, round(damage_ratio * 10))
Any user-defined functions can be used in your expressions just like the built-in functions. See functions for more information on user-defined RiskScape functions.
Operators
Operators are things like +
, -
and <
. They represent some abstract mathematical operation that depends on the
type of the things being operated on. RiskScape comes with some default operators and rules for applying them to
expressions, but note that these behaviours can be affected by 3rd party plugins. RiskScape only supports binary
operators, that is, operators that apply to two inputs, for example 1 + 2
where 1
and 2
are the inputs and +
is
the operator.
By default, all operators are supported for the number types (Floating and Integer). If these types are mixed,
(e.g. 1 + 2.3
) then the integer is converted to a floating number. If either operand is of nullable type, then the
result of the expression is also nullable. If either operand is null then the expression will return null.
If an operation is not supported for the given input types, then the expression will not be valid and cannot be
evaluated, e.g. 1 + [1]
would give an error like Could not find an operator function for operation 'PLUS' for types
'[Integer, List[Integer]]'
Technically an operator is actually a function with a more convenient expression format. The Function Resolution rules all apply to operators.
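The default rules for numeric operands described above (Integer to Floating conversion, null propagation, and rejection of unsupported types) can be sketched for the + operator as follows. This is a simplified Python model rather than RiskScape's implementation: None stands in for null, Python's int is unbounded whereas RiskScape's Integer is a 64-bit Long, and the type check happens at call time here, whereas RiskScape checks types before evaluation.

```python
def plus(lhs, rhs):
    """Model of '+' for Integer/Floating operands, with None standing in for null."""
    for operand in (lhs, rhs):
        is_number = isinstance(operand, (int, float)) and not isinstance(operand, bool)
        if operand is not None and not is_number:
            # Roughly corresponds to "Could not find an operator function ...".
            raise TypeError(
                "no PLUS operator for %s and %s"
                % (type(lhs).__name__, type(rhs).__name__)
            )
    # If either operand is null, the expression returns null.
    if lhs is None or rhs is None:
        return None
    # Mixed Integer/Floating: the integer is converted to a floating number.
    if isinstance(lhs, float) or isinstance(rhs, float):
        return float(lhs) + float(rhs)
    return lhs + rhs
```

With this model, plus(1, 2) stays an integer, plus(1, 2.3) yields a floating result, plus(None, 2) yields None, and plus(1, [1]) raises an error.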
Operator precedence
RiskScape applies mathematical operators with the following precedence:
bracketed expressions
exponentiation
division
multiplication
addition/subtraction
numeric comparisons (<,>…)
binary comparisons (and/or)
For example the following expressions are equivalent (resulting in: 16.0):
3 * 10 / 5 + 10
(3 * (10 / 5)) + 10
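The example can be worked through step by step, following the precedence list above. Python is used here purely as a calculator, with the grouping written out explicitly, because Python's own rules differ (it treats multiplication and division as equal precedence, evaluated left to right). The floating result assumes, as the 16.0 above suggests, that the division yields a floating number.

```python
# Step 1: division is applied before multiplication.
division = 10 / 5            # 2.0
# Step 2: multiplication.
product = 3 * division       # 6.0
# Step 3: addition comes last.
result = product + 10        # 16.0

# The same grouping written as one fully bracketed expression.
fully_bracketed = (3 * (10 / 5)) + 10
```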
Interaction with the type system
The RiskScape expression language is strongly typed with type inference. That is, each bit of data flowing through your model is associated with a type, and functions and operators will only evaluate if the given arguments can be made to match the types supported by the function or operator. Type inference means that, most of the time, you do not need to state the types of things in your expressions, because they are calculated dynamically.
Type inference and realization
When an expression is declared by a user, it is not yet ‘realized’ with any type information. Only once that expression is ‘realized’ with an input type does type inference happen, at which point RiskScape can check whether the expression can be evaluated or not. This realization typically happens when a model is validated, right before execution starts.
As an example, consider the following expression:
1 + (asset.cost)
While we know 1 is an integer, when RiskScape sees this expression, it doesn’t know the type of asset.cost
, nor does
it know if those attributes exist in the dataset we are ultimately going to be evaluating this expression against. When
this expression is brought together in to a model with input data and realized, we infer the type of the expression by
“filling in the gaps” - in the example given this means looking up asset.cost
to see if it exists in the input type,
and then using its type to determine whether plus is supported - more on this in the next sections.
If, in our example, asset.cost
is an integer, then the expression with its inferred types would look like:
(1):Integer + (asset.cost):Integer
An operator exists for adding two integers, and so realization succeeds.
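The realization of 1 + (asset.cost) can be pictured with the following Python sketch. It is a toy model under stated assumptions: the input type is represented as nested dictionaries of type names, the operator table lists only the numeric combinations described earlier, and every name in it (INPUT_TYPE, PLUS, RealizationError, lookup, realize_plus) is hypothetical rather than part of RiskScape.

```python
# Hypothetical input type: attribute names mapped to type names.
INPUT_TYPE = {"asset": {"cost": "Integer", "name": "Text"}}

# Hypothetical operator table for '+': (left type, right type) -> result type.
PLUS = {
    ("Integer", "Integer"): "Integer",
    ("Integer", "Floating"): "Floating",
    ("Floating", "Integer"): "Floating",
    ("Floating", "Floating"): "Floating",
}


class RealizationError(Exception):
    """Raised when an expression cannot be realized against the input type."""


def lookup(path, input_type):
    """Find the type of a dotted attribute path, e.g. 'asset.cost'."""
    current = input_type
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            raise RealizationError("no attribute '%s' in input type" % part)
        current = current[part]
    # A nested dictionary means the path stopped at a nested tuple.
    return "Struct" if isinstance(current, dict) else current


def realize_plus(constant_type, path, input_type):
    """Infer the type of `<constant> + <path>`, or fail realization."""
    attribute_type = lookup(path, input_type)
    key = (constant_type, attribute_type)
    if key not in PLUS:
        raise RealizationError("no PLUS operator for types %s and %s" % key)
    return PLUS[key]
```

Here realize_plus('Integer', 'asset.cost', INPUT_TYPE) succeeds with Integer, whereas asking for asset.name (a Text attribute) or asset.height (not in the input type) fails realization.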
Function resolution
When a function call is realized in an expression, RiskScape does the following things to resolve the function against the expected argument types:
First, a function is looked up from your project using the identifier, e.g. my_risk_function(asset) will look for a function with the ID my_risk_function. If none exists, realization will fail.
If the arguments to the function match exactly, then the function matches.
If the types that the function takes are broader than the types of the arguments given (also known as covariance), then the function still matches. For example, Anything is broader than Text, and Text is broader than WithinSet(Text, 'cat', 'dog', 'pig'). So a function that takes type Anything will accept 'cat' as an argument.
If the function requires a Floating argument and an Integer is provided, then the function matches and the Integer argument is converted to Floating.
If there are missing arguments but they are optional (they are nullable), then the function matches.
If any of the given arguments are Nullable, but the function does not accept nullable arguments, then the function matches, but the return type is adjusted to be nullable. If the function is called with missing arguments, then the function won’t apply and nothing is returned.
If none of these apply, the function does not match and won’t be evaluated.
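The argument-matching part of these steps can be sketched in Python as below. It is a simplified model, not RiskScape's resolver: it covers positional arguments only, skips the initial lookup by function ID, and represents types as plain strings with a hand-written "broader than" table. The names PARENT, Param, Arg, accepts and resolve, and the Nullable(...) notation, are all assumptions made for the sketch.

```python
from dataclasses import dataclass

# Hypothetical "is broader than" table: narrower type -> next broader type.
PARENT = {
    "WithinSet(Text, 'cat', 'dog', 'pig')": "Text",
    "Text": "Anything",
    "Integer": "Anything",
    "Floating": "Anything",
}


@dataclass(frozen=True)
class Param:
    type: str
    nullable: bool = False  # nullable parameters are treated as optional


@dataclass(frozen=True)
class Arg:
    type: str
    nullable: bool = False


def accepts(param_type, arg_type):
    """Exact match, a broader parameter type, or Integer -> Floating conversion."""
    if param_type == "Floating" and arg_type == "Integer":
        return True
    current = arg_type
    while current is not None:
        if current == param_type:
            return True
        current = PARENT.get(current)
    return False


def resolve(params, return_type, args):
    """Return the realized return type, or None if the function does not match."""
    if len(args) > len(params):
        return None
    nullable_result = False
    for index, param in enumerate(params):
        if index >= len(args):
            if not param.nullable:  # missing argument that is not optional
                return None
            continue
        arg = args[index]
        if not accepts(param.type, arg.type):
            return None
        # Nullable argument, non-nullable parameter: return type becomes nullable.
        if arg.nullable and not param.nullable:
            nullable_result = True
    return "Nullable(" + return_type + ")" if nullable_result else return_type
```

For instance, a function declared as taking one Anything parameter matches an argument of type WithinSet(Text, 'cat', 'dog', 'pig'), and a function taking Floating matches an Integer argument, while a function taking Text does not match an Integer argument.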
Overloaded functions
Some RiskScape functions are said to be “overloaded” - this means there are multiple versions of the same function that accept a different set of types.
For example, a length
function might be overloaded if it works with
both List
and Text
types. Overloaded functions follow the same function resolution steps, except that
each alternative is checked against the resolution steps listed above in order until one matches.
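A minimal Python sketch of trying overloads in order is shown below. To keep it short, a plain equality check on type names stands in for the full resolution steps, and the overload table for length (including its Integer return type) is hypothetical, as the text above only says that such a function might exist.

```python
# Hypothetical overloads of 'length': (parameter type, return type), in order.
LENGTH_OVERLOADS = [("List", "Integer"), ("Text", "Integer")]


def resolve_overload(overloads, arg_type):
    """Return (index, return type) of the first matching overload, else None."""
    for index, (param_type, return_type) in enumerate(overloads):
        # Simplification: equality stands in for the full resolution steps.
        if param_type == arg_type:
            return index, return_type
    return None
```

Calling resolve_overload(LENGTH_OVERLOADS, 'Text') skips the List alternative and selects the second one, while an argument type that matches no alternative gives None.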
https://gitlab.catalyst.net.nz/riskscape/riskscape/issues/71
Realizable functions
A realizable function is one that can adapt to the list of argument types it is given to calculate a return type.
The argument types a realizable function advertises are not really used - they exist for documentation reasons only. It is up to each function to attempt to adapt itself to the given input types and return an implementation of the function that best fits the inputs. Once this is done, the function is then matched using the steps above.
See Types for a more detailed explanation of RiskScape’s Type system.