Creating a RiskScape project
Before we start
This tutorial is aimed at new users who want to start creating their own projects. Projects need to be set up first, before you can build and run your own risk models in RiskScape. We expect that you:
Have completed the How to build RiskScape models guide and are familiar with building and running a RiskScape model.
Have a basic understanding of geospatial data and risk analysis.
Have some basic Python knowledge, or a willingness to learn.
The aim of this tutorial is to get you familiar with creating projects, bookmarks, and functions in RiskScape, so that you can build models on your own.
Getting started
Setup
Click here to download the example project we will use in this guide.
Unzip the file into the top-level Windows directory where you keep your RiskScape projects.
For example, if your top-level projects directory is C:\RiskScape_Projects\, then your unzipped directory will be C:\RiskScape_Projects\project-tutorial.
Open a command prompt and cd
to the directory where you unzipped the files, e.g.
cd project-tutorial
You will use this command prompt to run the RiskScape commands in this tutorial.
The unzipped project contains a few sub-directories:
project-tutorial\data contains the input data files we will use in this tutorial. This data is similar to the Upolu tsunami data that we used in the previous tutorials.
project-tutorial\functions contains Python files we will import as RiskScape functions.
project-tutorial\models contains some pre-built models we will use to test our project as we go along.
Note
This input data was provided by NIWA, as well as the PCRAFI (Pacific Risk Information System) website. The data files have been adapted slightly for this tutorial.
There is also an initial project-tutorial\project.ini
file that we will modify.
Open this project.ini
file in Notepad (or your preferred text editor).
Background
Project INI files
You may have noticed from previous tutorials that RiskScape gets all its configuration
information from a project.ini
file.
This tells RiskScape things like what models can be run, and what input data should be used in the models.
The project.ini file is in the INI format and can be modified in any plain-text editor, such as Notepad or gedit.
INI files contain key-value pairs, which are organized into sections. Square brackets are used to indicate the start of a section. A simple INI section might look something like this:
[section my-id]
key-one = some value
key-two = 2.0
In the previous tutorials, we have used INI files to save our model’s parameters. For example:
[model basic-exposure]
description = Simple example of a RiskScape model
framework = wizard
input-exposures.layer = data/Buildings_SE_Upolu.shp
input-hazards.layer = data/MaxEnv_All_Scenarios_50m.tif
sample.hazards-by = CLOSEST
analysis.function = is_exposed
Here the section starts with model, indicating that we are defining a RiskScape model, followed by the ID of the model (basic-exposure).
The lines that follow store the settings for the model’s parameters as key-value pairs.
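To make the section and key-value structure concrete, here is a minimal sketch that reads a cut-down copy of the model above with Python's standard configparser module. This is for illustration only: RiskScape uses its own INI parser, which may handle details differently, and the split_section helper is a hypothetical name introduced here.

```python
import configparser

# A cut-down copy of the model section shown above.
INI_TEXT = """
[model basic-exposure]
description = Simple example of a RiskScape model
framework = wizard
sample.hazards-by = CLOSEST
analysis.function = is_exposed
"""

def split_section(header):
    """Split a section header such as 'model basic-exposure' into (kind, id)."""
    kind, _, identifier = header.partition(" ")
    return kind, identifier

parser = configparser.ConfigParser()
# Keep keys exactly as written (configparser lower-cases them by default).
parser.optionxform = str
parser.read_string(INI_TEXT)

for header in parser.sections():
    kind, identifier = split_section(header)
    print(kind, identifier, dict(parser[header]))
```

The section header gives us the kind of thing being defined and its ID, and everything underneath is a plain key-value pair.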
In addition to models, the project.ini
also stores details about what goes into the models.
These are:
The input data files to use, which RiskScape calls bookmarks.
Python functions that will determine the impact the hazard has on each element-at-risk.
We will look into how to configure each of these in more detail.
Tip
The idea behind the project.ini
file is that it provides a way to organize your RiskScape models,
much like a work-space, so that you can keep related models (i.e. ones that use similar data or functions) together.
Completely unrelated models can go in a separate project.ini
file in another directory.
Bookmarks
A RiskScape bookmark identifies a file that can be used as an input layer in a model. Imagine your file system is a book - your bookmarks tell RiskScape what to use and how to use it.
A simple bookmark
Let’s look at a simple example. Add the following to your project.ini
file.
[bookmark Samoa_electoral_boundaries]
location = data/Samoa_constituencies.shp
Each RiskScape bookmark has an ID, which is the text that follows [bookmark ...].
In this case, the bookmark ID is Samoa_electoral_boundaries.
All RiskScape bookmarks must also have a location
, which specifies the input data to read.
Note
Your bookmark’s ID can contain spaces, e.g. [bookmark cool file].
However, this makes some RiskScape commands slightly harder to use.
You will need to enclose the bookmark ID in double-quotes when you use it on the command line, e.g.
riskscape bookmark info "cool file"
Save the project.ini
file and enter the following command in your terminal to check that RiskScape now knows
about the new bookmark.
riskscape bookmark list
You should see output similar to the following:
+--------------------------+-----------+----------------------------------------------------------------------------+
|id |description|location |
+--------------------------+-----------+----------------------------------------------------------------------------+
|Samoa_electoral_boundaries| |file:///C:/RiskScape_Projects/project-tutorial/data/Samoa_constituencies.shp|
+--------------------------+-----------+----------------------------------------------------------------------------+
Tip
You can add an optional description
key for most things in the project.ini
file.
The description is purely to help you keep track of what each model/bookmark/function does.
You can also add comments to the INI file by using #
at the start of the line.
Enter the following command into your terminal to view more detailed information about the bookmark:
riskscape bookmark info Samoa_electoral_boundaries
You should see output similar to the following:
"Samoa_electoral_boundaries"
Description :
Location : file:///C:/RiskScape_Projects/project-tutorial/data/Samoa_constituencies.shp
Attributes :
the_geom[MultiPolygon[crs=EPSG:4326]]
fid[Integer]
NAME_1[Text]
Region[Text]
Axis-order : long,lat / X,Y / Easting,Northing
CRS code : EPSG:4326
CRS (full) : GEOGCS["WGS 84",
DATUM["World Geodetic System 1984",
SPHEROID["WGS 84", 6378137.0, 298.257223563, AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich", 0.0, AUTHORITY["EPSG","8901"]],
UNIT["degree", 0.017453292519943295],
AXIS["Geodetic longitude", EAST],
AXIS["Geodetic latitude", NORTH],
AUTHORITY["EPSG","4326"]]
Summarizing...
Row count : 43
Bounds : EPSG:4326 [-172.8041 : -171.3977 East, -14.0772 : -13.4398 North] (original)
This output is quite technical-looking, but it tells us a few useful things:
The attributes that are present in the input data, i.e. the_geom, fid, NAME_1, and Region. It also shows us what type of data each attribute holds, e.g. fid is an Integer whereas NAME_1 is a Text string.
The Coordinate Reference System (CRS) (i.e. EPSG:4326 or WGS 84) and axis-order (i.e. long,lat) of the geometry.
The number of rows of data the file holds (i.e. Row count).
The geographic bounds of the data.
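If you ever want to process this output in a script, the attribute lines follow a simple name[Type] pattern. The sketch below pulls the names and types out of a copy of the attribute lines shown above. The parse_attributes helper is hypothetical and not part of RiskScape, and it assumes the output layout stays as shown here.

```python
import re

# Attribute lines copied from the 'riskscape bookmark info' output above.
ATTRIBUTE_LINES = """
the_geom[MultiPolygon[crs=EPSG:4326]]
fid[Integer]
NAME_1[Text]
Region[Text]
"""

def parse_attributes(text):
    """Return a dict mapping each attribute name to the type shown in brackets."""
    attributes = {}
    for line in text.splitlines():
        # A name, then '[', then everything up to the final ']'.
        match = re.fullmatch(r"\s*(\w+)\[(.*)\]\s*", line)
        if match:
            attributes[match.group(1)] = match.group(2)
    return attributes

attributes = parse_attributes(ATTRIBUTE_LINES)
print(attributes)
```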
Note
The CRS is an important part of the input data, which we will learn more about.
Conveniently, the CRS information for shapefiles is already all defined in a .prj file (i.e. Samoa_constituencies.prj), so we don’t have to worry about specifying a CRS for the bookmark.
Manipulating the input data
The main benefit of bookmarks is that they tell RiskScape how to load the data into the model.
When you are working with Shapefiles, GeoTIFFs, and ESRI Grid (i.e. .asc
) files,
most of what RiskScape needs to know is already encoded into the file format.
However, even with these file formats, bookmarks still allow you to manipulate the input data in useful ways.
Let’s look at an example of this in action.
Your project comes with an exposure-by-region model, which is already defined in the models/models_exposure-by-region.ini file:
[model exposure-by-region]
framework = wizard
description = Produces a total count of buildings in each region exposed to tsunami inundation
input-exposures.layer = data/Buildings_SE_Upolu.shp
input-exposures.geoprocess = false
input-hazards.layer = data/MaxEnv_All_Scenarios_50m.tif
input-areas.layer = Samoa_electoral_boundaries
input-areas.geoprocess = false
sample.hazards-by = CLOSEST
analysis.function = is_exposed
report-event-impact.filter = consequence = 1
report-event-impact.group-by[0] = area
report-event-impact.aggregate[0] = count(*) as Exposed_buildings
report-event-impact.select[0] = area.Region as Region
report-event-impact.select[1] = Exposed_buildings
This model counts the number of exposed buildings by region (using our Samoa_electoral_boundaries
bookmark as the area-layer),
similar to models we have used in previous tutorials.
Run this model now, by entering the following command:
riskscape model run exposure-by-region
It should produce an output/exposure-by-region/TIMESTAMP/event-impact.csv results file, where TIMESTAMP is the current date/time, e.g. 2022-01-13T17_38_2.
We can use the more "FILENAME"
command to quickly look at a text file’s contents from the terminal,
such as the event-impact.csv
file produced here, e.g.
more "output/exposure-by-region/TIMESTAMP/event-impact.csv"
Tip
Forward slashes in file-paths generally work OK in the Windows Command Prompt, as long as you surround the path in double-quotes, e.g. "output/some-file.csv".
This means you can copy-paste the results filename from the URI that RiskScape displays.
Simply select the text and use Ctrl + c and Ctrl + v to copy-paste in the Windows Command Prompt.
The event-impact.csv
file should contain the following:
Region,Exposed_buildings
,10
Aleipata Itupa i Lalo,526
Aleipata Itupa i Luga,340
Falealili,749
Lepa,288
Lotofaga,146
Now let’s say we wanted a slightly different regional breakdown of the results. The area-layer is just a parameter to the model, so RiskScape will let us replace the parameter with a different file.
Try running the following command to use the data/ws_districts.shp
file as our area-layer.
riskscape model run exposure-by-region -p "input-areas.layer=data/ws_districts.shp"
This time, instead of running our model, RiskScape gives us an error:
There was a problem with the parameters for wizard model
- Failed to load the saved model. Some parameters specified may be invalid. If you have
altered parameters manually, try going through the interactive wizard again
- Problems found with 'report-event-impact.select' parameter
- Failed to validate 'select({area.Region as Region, Exposed_buildings})' step ...
- Failed to validate expression '{area.Region as Region, Exposed_buildings}' ...
- Could not find 'area.Region' among [area.the_geom, area.fid, area.District, Exposed_buildings]
Troubleshooting RiskScape errors
RiskScape errors are often nested like this. The top problem describes the high-level operation that failed, and the subsequent problems then drill-down into more and more specific context about what went wrong.
Let’s look at these errors in more detail and try to work out what went wrong:
The first error tells us there was a problem loading the saved model, possibly related to the model parameters that we used.
The next error says the problem was specifically with the report-event-impact.select parameter. We didn’t actually change that parameter at all. In our model, that parameter looks like this:
report-event-impact.select[0] = area.Region as Region
The next two errors specify the pipeline step and expression that failed. We will learn more about these concepts in subsequent tutorials.
The final error tells us that the area.Region attribute does not exist. Only the area.the_geom, area.fid, and area.District attributes are present in the model.
So, what went wrong?
The attributes that are available in a RiskScape model depend on what input data the model uses.
In this case, it appears that our original area-layer has a Region
attribute,
but our new area-layer does not.
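The check that failed here boils down to comparing the attribute names the model asks for against the names the input layer provides. The sketch below illustrates that idea with plain Python sets. It is not how RiskScape implements validation, and missing_attributes is a hypothetical helper.

```python
def missing_attributes(required, available):
    """Return the required attribute names that the layer does not provide."""
    return sorted(set(required) - set(available))

# Attribute names copied from the bookmark output and error message in this tutorial.
original_area_layer = ["the_geom", "fid", "NAME_1", "Region"]
new_area_layer = ["the_geom", "fid", "District"]

# The model's report step selects area.Region, so 'Region' is required.
required = ["Region"]

print(missing_attributes(required, original_area_layer))
print(missing_attributes(required, new_area_layer))
```

An empty list means the layer can be used as-is. Anything else is an attribute the model will complain about.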
Let’s confirm this by taking a closer look at our new area-layer. Enter the following command:
riskscape bookmark info "data/ws_districts.shp"
You can see from the output that the file does not contain a Region
attribute,
although it does have a District
attribute instead, i.e.
Location : file:///C:/RiskScape_Projects/project-tutorial/data/ws_districts.shp
Attributes :
the_geom[MultiPolygon[crs=EPSG:4326]]
fid[Integer]
District[Text]
...
Tip
In many cases, bookmarks and file paths can be used interchangeably in RiskScape.
For example, here we passed a file path directly to the riskscape bookmark info
command.
This means you can use file paths as model parameters without necessarily creating bookmarks.
Consistent input data
In order to reuse the same model with different input files,
some attributes in the input data (in this case, the Region
attribute) will need to be consistent across the files.
The naive approach would be to manually rename the attribute in the input data, and re-save the shapefile. However, this can be cumbersome and error-prone if you need to do it often.
RiskScape bookmarks can solve the problem for us.
Let’s create a new bookmark for this second area-layer shapefile.
Add the following to your project.ini
file and save it.
[bookmark Samoa_districts]
location = data/ws_districts.shp
set-attribute.Region = District
The last line is setting a new attribute called Region
, which will hold whatever value is in the District
attribute.
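Conceptually, set-attribute adds a computed attribute to every row of input data as it is loaded. The sketch below mimics that with Python dictionaries. The rows are made up for illustration, and set_attribute is a hypothetical helper rather than anything RiskScape provides.

```python
def set_attribute(rows, name, compute):
    """Return new rows with an extra attribute; the original rows are left untouched."""
    return [{**row, name: compute(row)} for row in rows]

# Made-up rows standing in for the area-layer; only 'District' matters here.
rows = [
    {"fid": 1, "District": "Falealili"},
    {"fid": 2, "District": "Lepa"},
]

# Rough equivalent of: set-attribute.Region = District
with_region = set_attribute(rows, "Region", lambda row: row["District"])
print(with_region)
```

Note that the District value is copied rather than moved, so both attributes end up in each row.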
Enter the following command to see what the bookmark data looks like now:
riskscape bookmark info Samoa_districts
You should see that there is now a new Region
attribute in the output.
The original District
attribute is still also present.
"Samoa_districts"
Description :
Location : file:///C:/RiskScape_Projects/project-tutorial/data/ws_districts.shp
Attributes :
the_geom[MultiPolygon[crs=EPSG:4326]]
fid[Integer]
District[Text]
Region[Text]
...
Now enter the following command to use our new bookmark in the model.
riskscape model run exposure-by-region -p "input-areas.layer=Samoa_districts"
This time the model runs successfully because all the attributes it needs are present in the input data.
Note
In this case we simply copied an existing attribute in the input data,
but you can manipulate the data in more complicated ways.
For example, you could convert imperial units into the metric system using:
set-attribute.metres = feet / 3.281
Filtering
Let’s just take a quick look at the event-impact.csv
results file that the last riskscape model run
command produced.
Use more "output/MODEL/TIMESTAMP/event-impact.csv"
to look at the results, e.g.
more "output/exposure-by-region/2022-01-13T17_38_25/event-impact.csv"
Region,Exposed_buildings
Aleipata Itupa i Lalo,507
Aleipata Itupa i Luga,339
Falealili,749
Lepa,283
Lotofaga,146
Marine Area,35
If you look carefully, you will notice there is a ‘Marine Area’ region now present in the results. Our model now thinks some buildings are located in the sea, which is not ideal.
Area-layer shapefiles often contain polygons that denote bodies of water, but we generally want to ignore these areas in our model.
Bookmarks also let us filter the input data so that only certain rows of data are included in the model. We can specify a true/false condition, and only input data that satisfies that condition will be used in the model.
In your project.ini
file, add the following line to your Samoa_districts
bookmark, and save the file.
filter = Region != 'Marine Area'
Your bookmark should now look like this:
[bookmark Samoa_districts]
location = data/ws_districts.shp
set-attribute.Region = District
filter = Region != 'Marine Area'
Note
We are using a !=
condition here, because we want to exclude a specific row of data,
i.e. include everything except the ‘Marine Area’ row of data.
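The effect of a bookmark filter is much like a list comprehension in Python: each row is tested against the condition and only the rows that pass are kept. The sketch below is an illustration under that assumption, with apply_filter as a hypothetical helper.

```python
def apply_filter(rows, condition):
    """Keep only the rows for which the condition is true."""
    return [row for row in rows if condition(row)]

# Region names taken from the results shown earlier in this tutorial.
rows = [
    {"Region": "Falealili"},
    {"Region": "Lepa"},
    {"Region": "Marine Area"},
]

# Rough equivalent of: filter = Region != 'Marine Area'
kept = apply_filter(rows, lambda row: row["Region"] != "Marine Area")
print([row["Region"] for row in kept])
```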
Now try using the updated area-layer bookmark in your model by running the following command:
riskscape model run exposure-by-region -p "input-areas.layer=Samoa_districts"
Take a look at the event-impact.csv
file that the model produces. It should look like this:
more "output/exposure-by-region/2022-01-13T18_05_00/event-impact.csv"
Region,Exposed_buildings
,19
Aleipata Itupa i Lalo,518
Aleipata Itupa i Luga,341
Falealili,749
Lepa,286
Lotofaga,146
The ‘Marine Area’ is no longer present in the results, although we do have 19 buildings that were not matched to any region now.
If you look carefully, you will notice that 35 buildings were previously matched to the ‘Marine Area’, but now only 19 buildings have no region. This is because some buildings (16) were straddling a regional boundary.
We use ‘closest’ spatial matching for the area-layer. When a building intersects two regions, we assign it to the region that’s closest to the building’s centroid. When we removed the ‘Marine Area’, it meant that 16 buildings now only intersected one region instead of two.
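We can double-check this reasoning with some quick arithmetic on the two results files shown above. The sketch below simply copies the counts into Python dictionaries. Nothing here comes from RiskScape itself.

```python
# Exposed-building counts copied from the two results files shown above.
before_filter = {
    "Aleipata Itupa i Lalo": 507,
    "Aleipata Itupa i Luga": 339,
    "Falealili": 749,
    "Lepa": 283,
    "Lotofaga": 146,
    "Marine Area": 35,
}
after_filter = {
    "": 19,  # buildings not matched to any region
    "Aleipata Itupa i Lalo": 518,
    "Aleipata Itupa i Luga": 341,
    "Falealili": 749,
    "Lepa": 286,
    "Lotofaga": 146,
}

# Filtering the area-layer should not change the total number of exposed buildings.
total_before = sum(before_filter.values())
total_after = sum(after_filter.values())

# Buildings that moved from 'Marine Area' into a land region.
reassigned = before_filter["Marine Area"] - after_filter[""]

# Per-region gains, which should add up to the reassigned buildings.
gains = {
    region: after_filter[region] - count
    for region, count in before_filter.items()
    if region in after_filter and after_filter[region] != count
}
print(total_before, total_after, reassigned, gains)
```

The totals match (2059 in both runs), and the three regions that gained buildings account for exactly the 16 that were reassigned.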
We could potentially use the sample.areas-buffer
model parameter here to assign all buildings to a region,
like we did in the previous tutorial.
Tip
The bookmark filter
parameter essentially works the same as the ‘filter’ geoprocessing option in the wizard.
Using the wizard can make it easier to build filter expressions.
Problematic input data
Dealing with real world data can sometimes be a little messy. Let’s look at some examples of how RiskScape deals with problematic data.
In the data/
sub-directory, there is also a problematic.shp
file.
Try running the following command to use it as the model’s area-layer.
riskscape model run exposure-by-region -p "input-areas.layer=data/problematic.shp"
You should see an error message like this:
15:29:14.642 [main] WARN n.o.r.e.d.r.FeatureSourceBookmarkResolver - No crs could be parsed
for feature source from file:///C:/RiskScape_Projects/project-tutorial/data/problematic.shp,
falling back to generic 2d
There was a problem with the parameters for wizard model
- Could not apply the answer to the 'input-areas.layer' parameter to your model
- The given Geom type does not contain the required spatial meta-data (i.e. CRS). This
could be because the input data comes from a CSV file and 'crs-name' needs to be set
in the bookmark
The error tells us that RiskScape could not read the CRS information for this shapefile.
If you look closely at the data/
sub-directory, you will see that the .prj
file
that contains all the shapefile’s CRS information is actually missing, i.e. there is no problematic.prj
file.
Tip
In Windows Command Prompt, you can use the dir
command to get a list of any
matching files in a directory, e.g. dir data\problematic.prj
Let’s try doing what the error suggests and create a bookmark with crs-name
set.
We know the CRS for this file should be EPSG:4326, or WGS 84,
so add the following to your project.ini
file and save it.
[bookmark problematic]
location = data/problematic.shp
crs-name = EPSG:4326
Now, try running the following command to use the new bookmark in the model:
riskscape model run exposure-by-region -p "input-areas.layer=problematic"
This time the model runs to completion. However, we still see some warnings about invalid input data displayed:
WARNING: An invalid row of input data has been skipped
- An invalid geometry which cannot be fixed automatically has been detected. Caused by:
Invalid Coordinate at or near point (NaN, -172.03240134903). Refer to the Geometry
reference in the RiskScape documentation for tips on how to avoid this. The row
containing this geometry was: {fid=999, Region=Bad geo…}
WARNING: Problems found with 'problematic' bookmark in location
file:///C:/RiskScape_Projects/project-tutorial/data/problematic.shp
- Invalid geometry has been detected and fixed automatically. Refer to the Geometry
reference in the RiskScape documentation for tips on how to avoid this. The record
containing this geometry was: {fid=1, Region=Marine …}
These warnings tell us that RiskScape encountered invalid geometry in the input data.
The first message tells us that a row of input data was skipped because it contained invalid geometry. This means that this particular row of input data was omitted from our model.
The second message also deals with invalid geometry, but this time RiskScape fixed the geometry for us and continued to use it in the model.
Note
Under the Reference Guides in RiskScape’s documentation, there is a page on Geometry that contains more details about Invalid geometry.
If you want to, you can control what RiskScape does in these situations using bookmark parameters:
The skip-invalid bookmark parameter determines what RiskScape should do when an invalid row of input data is detected. By default, the invalid row is simply skipped and RiskScape continues, but this can be changed so that the riskscape model run command stops with an error by using skip-invalid = false.
The validate-geometry bookmark parameter controls whether or not RiskScape validates geometry and attempts to fix it.
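The skip-or-stop choice behind skip-invalid can be pictured with a small Python sketch. This is purely conceptual: load_rows and its skip_invalid argument are hypothetical names, the rows are made up, and RiskScape's real validation is more involved than a NaN check.

```python
import math

def load_rows(rows, skip_invalid=True):
    """Yield rows with usable coordinates; skip or fail on invalid ones."""
    for row in rows:
        x, y = row["coords"]
        if math.isnan(x) or math.isnan(y):
            if skip_invalid:
                print(f"WARNING: skipped invalid row fid={row['fid']}")
                continue
            raise ValueError(f"invalid coordinate in row fid={row['fid']}")
        yield row

# Made-up rows; fid=999 mimics the NaN coordinate from the warning above.
rows = [
    {"fid": 1, "coords": (-172.0, -14.0)},
    {"fid": 999, "coords": (float("nan"), -172.03240134903)},
]

# Default behaviour: warn about the bad row and carry on.
loaded = list(load_rows(rows))
print([row["fid"] for row in loaded])
```

Calling load_rows(rows, skip_invalid=False) raises an error at the first bad row instead, which is the stricter behaviour you would choose if silently dropping data is not acceptable.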
Tip
The default bookmark settings should be sufficient for most modelling, so you shouldn’t need to worry too much about changing these bookmark parameters.
Using CSV data
Let’s try another bookmark example. This time we will replace the model’s exposure-layer.
We have a data/Buildings_SE_Upolu_centroids.csv
Comma Separated Values (CSV) file that contains building centroid data for south-eastern Upolu.
If you use the more
command to look at this file, it contains data that looks like the following:
more "data/Buildings_SE_Upolu_centroids.csv"
WKT,ID,Use_Cat,Cons_Frame
POINT (422324.1392684035 8450527.521981074),1360,Outbuilding,Masonry
POINT (422192.23654263915 8450396.489492511),1361,Residential,Masonry
POINT (422204.39138965635 8450380.92939743),1362,Outbuilding,Masonry
POINT (422208.9813466044 8450102.043773355),1607,Residential,Masonry
POINT (422219.40361522196 8450115.30060319),1608,Residential,Masonry
...
Note
The first column of this CSV file contains a WKT
attribute that stores geometry information in Well-Known Text (WKT) format.
Try using this CSV file in the model using the following command:
riskscape model run exposure-by-region -p "input-exposures.layer=data/Buildings_SE_Upolu_centroids.csv"
You should see the following error this time:
There was a problem with the parameters for wizard model
- Could not apply the answer to the 'input-exposures.layer' parameter to your model
- Geometry attribute required but none found in {WKT=>Text,
ID=>Text, Use_Cat=>Text, Cons_Frame=>Text}
Each input layer in the RiskScape model needs to contain some form of geometry, but RiskScape couldn’t find any geometry in our exposure-layer input data.
Let’s take a look at the attributes that this CSV file contains by running the following command:
riskscape bookmark info "data/Buildings_SE_Upolu_centroids.csv"
It should produce the following output:
Location : file:///C:/RiskScape_Projects/project-tutorial/data/Buildings_SE_Upolu_centroids.csv
Attributes :
WKT[Text]
ID[Text]
Use_Cat[Text]
Cons_Frame[Text]
Summarizing...
Row count : 6260
Each attribute in this output has a name as well as a data type, which is in the square brackets.
So RiskScape can see the WKT
attribute in the input data, but it has a Text
string type
rather than a Geometry
type, which is what RiskScape needs.
Note
All the data in a RiskScape model has type information associated with it.
With shapefiles, the attribute data types are saved as part of the file format.
However, attributes in a CSV file are always Text
type by default.
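Python's own csv module behaves the same way, which makes it a handy way to see the issue. The sketch below reads two lines copied from the CSV sample shown above and prints the Python type of each value.

```python
import csv
import io

# Two lines copied from the CSV sample shown above.
CSV_TEXT = """WKT,ID,Use_Cat,Cons_Frame
POINT (422324.1392684035 8450527.521981074),1360,Outbuilding,Masonry
"""

row = next(csv.DictReader(io.StringIO(CSV_TEXT)))
print({name: type(value).__name__ for name, value in row.items()})

# A cast is needed before the ID can be treated as a number.
building_id = int(row["ID"])
print(building_id + 1)
```

Every value comes back as a string, including the ID and the geometry, until something explicitly converts it.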
Types
We can use the set-attribute
bookmark parameter to change the underlying type of the input data.
Converting CSV attributes into numeric data is pretty simple in RiskScape. It looks similar to using type casts in Python, for example:
# below converts 'year' attribute to an integer (i.e. a whole number)
set-attribute.year = int(year)
# below converts 'cost' into a floating-point number (i.e. with a decimal place)
set-attribute.cost = float(cost)
Here, the int(year)
line is an example of a RiskScape expression.
It is actually calling the built-in RiskScape int()
function, which converts a text-string into an integer.
To turn a WKT string into a geometry type, we can use a built-in RiskScape function called geom_from_wkt.
Try adding the following bookmark to your project.ini
file and then save it.
[bookmark building_centroids_csv]
location = data/Buildings_SE_Upolu_centroids.csv
set-attribute.geom = geom_from_wkt(WKT)
Note
Sometimes, rather than WKT, the input data will contain point geometry where each coordinate is a separate attribute, e.g. POINT_X and POINT_Y.
In that case, instead of geom_from_wkt(WKT), you can use the create_point(POINT_X, POINT_Y) RiskScape function to turn the individual coordinates into geometry.
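To get a feel for what these conversions involve, here is a rough Python sketch that turns text into coordinate pairs. The point_from_wkt and point_from_xy helpers are hypothetical stand-ins written for this example. They only handle simple points and are not the RiskScape functions themselves.

```python
import re

def point_from_wkt(wkt):
    """Parse a WKT string of the form 'POINT (x y)' into an (x, y) tuple of floats."""
    match = re.fullmatch(r"\s*POINT\s*\(\s*(\S+)\s+(\S+)\s*\)\s*", wkt)
    if match is None:
        raise ValueError(f"not a simple WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))

def point_from_xy(x, y):
    """Build the same kind of (x, y) tuple from separate coordinate values given as text."""
    return float(x), float(y)

# First data row of the CSV sample shown earlier.
point = point_from_wkt("POINT (422324.1392684035 8450527.521981074)")
print(point)
print(point_from_xy("422324.1392684035", "8450527.521981074"))
```

Either way, the result is just a pair of numbers. On their own they say nothing about which CRS they belong to.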
Run the following command to use the new bookmark in your model:
riskscape model run exposure-by-region -p "input-exposures.layer=building_centroids_csv"
We still get the following error, but we have seen this problem before.
There was a problem with the parameters for wizard model
- Could not apply the answer to the 'input-exposures.layer' parameter to your model
- The given Geom type does not contain the required spatial meta-data (i.e. CRS). This
could be because the input data comes from a CSV file and 'crs-name' needs to be set
in the bookmark
In this case, we know the geometry data is in the EPSG:32702 CRS.
Add a crs-name = EPSG:32702
line to your bookmark so that it looks like this:
[bookmark building_centroids_csv]
location = data/Buildings_SE_Upolu_centroids.csv
set-attribute.geom = geom_from_wkt(WKT)
crs-name = EPSG:32702
Tip
When you have CSV input data, you will always need to specify the set-attribute.geom
and crs-name
parameters for your bookmark.
Save your project.ini
file and try using the updated bookmark in the ‘model run’ command:
riskscape model run exposure-by-region -p "input-exposures.layer=building_centroids_csv"
This time the model should successfully output a results file.
Note
With CSV data you may also have to specify the axis-order that the CRS is in,
i.e. whether the coordinates are in lat,long
or long,lat
order.
In this case the EPSG:32702 specification defines an easting, northing
(i.e. long,lat
) axis order so we don’t need to specify the axis-order manually.
The Geometry Reference Guide has more details on Axis/Ordinate Order.
Testing your bookmark
RiskScape provides a way to easily see what your input data will look like when it is used in your model. This is particularly useful when dealing with CSV input data, where it is easy to get the CRS axis ordering wrong.
Using the riskscape bookmark evaluate BOOKMARK_NAME
command will produce a shapefile that contains all the changes that your bookmark applies to the input data.
This shapefile can then be easily viewed in your preferred GIS application.
You can try this yourself using the building_centroids_csv
bookmark in the project.ini
file.
riskscape bookmark evaluate building_centroids_csv
Bookmark formats
How RiskScape loads input data depends on the file format that the data is in.
In our bookmark examples so far, RiskScape has determined the file format based on the file extension.
However, we can use the format
parameter to specify explicitly what file format the data is in.
Try adding the following bookmark to your project.ini
file and save it.
[bookmark Te_Araroa]
description = An online map of the Te Araroa trail, NZ
location = https://opendata.arcgis.com/api/v3/datasets/330fe731ff444471a45d88d8b681e53d_0/downloads/data?format=geojson&spatialRefId=4326
format = geojson
This hyperlink points to a map of the Te Araroa walking trail, in GeoJSON format.
RiskScape can download remote data and use it in a model. However, we need to explicitly set the bookmark’s format in this case.
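A small sketch shows why a file extension is not enough for this kind of location. The extension-to-format table below is hypothetical and only for illustration (use riskscape format list to see the real format names), and guess_format is not a RiskScape function.

```python
import os
from urllib.parse import urlparse

# Hypothetical extension-to-format table, for illustration only.
FORMATS_BY_EXTENSION = {".shp": "shapefile", ".csv": "csv", ".geojson": "geojson"}

def guess_format(location, explicit_format=None):
    """Return the explicit format if given, else guess from the file extension."""
    if explicit_format is not None:
        return explicit_format
    path = urlparse(location).path
    extension = os.path.splitext(path)[1].lower()
    return FORMATS_BY_EXTENSION.get(extension)  # None when nothing matches

url = ("https://opendata.arcgis.com/api/v3/datasets/"
       "330fe731ff444471a45d88d8b681e53d_0/downloads/data"
       "?format=geojson&spatialRefId=4326")

print(guess_format("data/Samoa_constituencies.shp"))
print(guess_format(url))
print(guess_format(url, explicit_format="geojson"))
```

The path part of the URL ends in data, with no extension at all, so there is nothing to guess from until the format is stated explicitly.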
Check that RiskScape can load the bookmark’s data by running the following command:
riskscape bookmark info Te_Araroa
It should display output similar to the following:
"Te_Araroa"
Description : An online map of the Te Araroa trail, NZ
Location : https://opendata.arcgis.com/api/v3/datasets/330fe731ff444471a45d88d8b681e53d_0/downloads/data?format=geojson&spatialRefId=4326
Attributes :
geometry[Geom[crs=EPSG:4326]]
OBJECTID[Integer]
SEQUENCE[Integer]
STATUS[Text]
LENGTH[Floating]
NAME[Text]
ISLAND[Text]
LEGALSTAT[Text]
complete[Integer]
Notes[Text]
Fromkm[Floating]
Tokm[Floating]
category[Integer]
Cycle[Integer]
walkid[Integer]
mapName[Text]
link[Text]
editor[Text]
create_dt[Text]
last_editor[Text]
last_edit_dt[Text]
SHAPE_Length[Floating]
Axis-order : long,lat / X,Y / Easting,Northing
CRS code : EPSG:4326
CRS (full) : GEOGCS["WGS84",
DATUM["WGS84",
SPHEROID["WGS84", 6378137.0, 298.257223563]],
PRIMEM["Greenwich", 0.0],
UNIT["degree", 0.017453292519943295],
AXIS["Geodetic longitude", EAST],
AXIS["Geodetic latitude", NORTH],
AUTHORITY["Web Map Service CRS","84"]]
Summarizing...
Row count : 482
Bounds : EPSG:4326 [167.8103 : 175.6674 East, -46.6253 : -34.4267 North] (original)
Supported formats
The file format can affect what bookmark parameters RiskScape will accept. For example, a shapefile bookmark will support some parameters that cannot be used with a GeoTIFF bookmark.
To see a list of supported input formats, use the command:
riskscape format list
To see what parameters a particular bookmark format supports, use the command:
riskscape format info FORMAT_NAME
Functions
Besides bookmarks, the other important piece of information that our project.ini
file holds is functions.
Functions are typically written in Python and are used in the Consequence Analysis phase of the model workflow, to determine the impact or consequence that the hazard has on each element-at-risk.
You may recall the following points from the previous tutorial:
In general, RiskScape will call your function for each element-at-risk (i.e. building) in your exposure-layer. If your data contains 6,000 buildings, then your function will get called 6,000 times.
RiskScape will pass your function two values: the element-at-risk and the hazard intensity measure. We call these the function’s arguments.
The function’s return value gets added to the model’s results as the
consequence
attribute.
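These three points can be sketched as a simple loop. This is only a mental model of the workflow, not RiskScape's implementation: run_analysis and is_exposed_sketch are hypothetical names, and the buildings and hazard values are made up.

```python
def run_analysis(function, sampled):
    """Call the function once per element-at-risk and record the consequence."""
    results = []
    for building, hazard in sampled:
        results.append({**building, "consequence": function(building, hazard)})
    return results

def is_exposed_sketch(building, hazard):
    """Stand-in for an exposure check: 1 if a hazard value is present, else 0."""
    return 0 if hazard is None else 1

# Made-up buildings paired with a sampled hazard value (None means no hazard data).
sampled = [
    ({"ID": 1360}, 0.45),
    ({"ID": 1361}, None),
]

results = run_analysis(is_exposed_sketch, sampled)
print(results)
```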
Tip
If you are new to Python, or find the idea of RiskScape functions a little intimidating, then there is a simple RiskScape Hello, world exercise you could try first.
A simple function
Currently the exposure-by-region
model uses the built-in is_exposed
function.
This returns 1
if the element-at-risk was exposed to any hazard data, and 0
if not.
Let’s try adding our own version of this function that applies a minimum threshold to the hazard intensity value.
In the functions/
sub-directory there is a threshold.py
file that contains the following Python code:
THRESHOLD = 0.1 # metres

def function(building, hazard):
    if hazard is None or hazard <= THRESHOLD:
        return 0
    else:
        return 1
Warning
This function is purely for demonstrative purposes and is not based on scientific methodology in any way.
Before we can use this function in our model, we have to tell RiskScape about it in our project.ini
file.
RiskScape needs to know:
where the Python code is located, i.e. its location.
what types of arguments the function expects, i.e. its argument-types.
what type of data the function returns, i.e. its return-type.
Add the following to your project.ini
file and save it.
[function exceeds_threshold]
description = returns 1 if the hazard value exceeds a pre-determined threshold
location = functions/threshold.py
argument-types = [building: anything, hazard: nullable(floating)]
return-type = integer
The building
argument type here is anything
, which means we can pass any sort of exposure-layer data to our function.
The hazard
argument here is nullable
, which means a hazard intensity measure might not exist for every element-at-risk.
For example, if a building falls outside the hazard bounds, then there will be no hazard intensity measure associated with it.
In these cases our function will still be called, but the hazard
argument will be nothing (None
in Python).
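You can try the function out in plain Python to see how it treats a missing hazard value. The copy below adds type hints that mirror the argument-types and return-type settings. The hints are an addition for this sketch only, and the calls at the bottom use made-up inputs.

```python
from typing import Any, Optional

THRESHOLD = 0.1  # metres

def function(building: Any, hazard: Optional[float]) -> int:
    """Return 1 if the hazard value exceeds THRESHOLD, otherwise 0."""
    if hazard is None or hazard <= THRESHOLD:
        return 0
    return 1

# The building argument is unused, so an empty placeholder will do here.
no_hazard = function({}, None)
at_threshold = function({}, 0.1)
above_threshold = function({}, 0.5)
print(no_hazard, at_threshold, above_threshold)
```

Checking for None before comparing matters: in Python 3, comparing None with a number using <= raises a TypeError.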
Tip
Using the anything type as a function argument can be a little inefficient for performance, but it is a simple way to get started defining your own RiskScape functions.
If your hazard-layer is shapefile data, then you could use the anything type for it too, e.g. hazard: nullable(anything).
Run the following command to check that RiskScape now knows about the function:
riskscape function list
It should display the following:
+------------------+-------------------------------------+------------------------------------+-----------+---------------+
|id |description |arguments |return-type|category |
+------------------+-------------------------------------+------------------------------------+-----------+---------------+
|exceeds_threshold |returns 1 if the hazard value exceeds|[building: Anything, hazard: |Integer |UNASSIGNED |
| |a pre-determined threshold |Nullable[Floating]] | | |
| | | | | |
|is_exposed |Simple function to check if an |[exposure: Anything, hazard: |Integer |RISK_MODELLING |
| |element-at-risk is exposed to the |Nullable[Anything], resource: | | |
| |hazard. Returns 1 if the `hazard` |Nullable[Anything]] | | |
| |argument is present (i.e. not null) | | | |
| |and 0 if not. Useful as a placeholder| | | |
| |function in risk modelling as it | | | |
| |accepts any types for exposure, | | | |
| |hazard and optional resource. | | | |
+------------------+-------------------------------------+------------------------------------+-----------+---------------+
Now try using this new function in your model by running the following command:
riskscape model run exposure-by-region -p "analysis.function=exceeds_threshold"
It should produce an event-impact.csv
file containing the following results.
Region,Exposed_buildings
,10
Aleipata Itupa i Lalo,498
Aleipata Itupa i Luga,318
Falealili,704
Lepa,264
Lotofaga,138
If you look closely, you will see the Exposed_buildings
count is now lower,
as buildings that were exposed to <= 10cm of tsunami inundation are now excluded from the results.
Tip
Using a threshold function like this might be useful for dealing with hazard data such as rainfall, wind-speed, or Peak Ground Acceleration (PGA). For example, a given element-at-risk might be exposed to hazard data, but the hazard intensity might be too small to cause any real damage.
Exposure-layer arguments
The consequence that your Python function produces can vary depending on what you are modelling. The consequence might be:
whether or not the building is exposed to the hazard. This is what we have been modelling so far.
the damage state of the building. This can measure the probability that a building will sustain a given level of damage, such as complete structural collapse.
the resulting loss. This is the cost to repair or replace the building.
The Python function examples we have covered so far have only used the hazard
function argument.
Our functions have all ignored the building data that is coming from the exposure-layer,
but this data will be useful if we want to calculate the damage state or loss for the building.
The exposure-layer data gets passed to the function as a Python dictionary.
If our function argument is called building
, then we can access attributes from the exposure-layer using:
value = building['ATTRIBUTE_NAME']
Replace ATTRIBUTE_NAME
with whatever exposure-layer attribute you are interested in, e.g. Use_Cat
, Cons_Frame
, etc.
Remember that you can use the riskscape bookmark info
command to see what attributes are present in your exposure-layer.
Note
You can also access the exposure-layer attributes by using building.get('ATTRIBUTE_NAME')
.
The difference is this approach will return None
if the attribute doesn’t exist in the exposure-layer,
whereas building['ATTRIBUTE_NAME']
will result in a Python KeyError
exception and your model will stop.
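The difference between the two access styles can be seen with an ordinary Python dictionary. This is a minimal sketch: the record below and the Roof_Type attribute name are made up for illustration and are not taken from the tutorial data.

```python
# Hypothetical exposure-layer record; the values are made up
building = {'Cons_Frame': 'Timber'}

present = building.get('Cons_Frame')             # attribute exists
missing = building.get('Roof_Type')              # hypothetical attribute, not in the record
fallback = building.get('Roof_Type', 'Unknown')  # supply a default instead of None

try:
    building['Roof_Type']
    raised = False
except KeyError:
    # Square-bracket access raises KeyError for a missing attribute
    raised = True

print(present, missing, fallback, raised)
```

Using get() with a default can be handy when an attribute is optional, but it can also hide a misspelt attribute name, so square brackets are often the safer choice for attributes your function truly depends on.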
Let’s try a simple example of using an exposure-layer attribute.
In the functions/
sub-directory there is a threshold_by_cons.py
file.
It is similar to the threshold.py
function, except it uses a different threshold based on construction type.
def function(building, hazard):
    construction = building['Cons_Frame']

    if construction == 'Masonry':
        threshold = 0.2
    else:
        threshold = 0.1

    if hazard is None or hazard <= threshold:
        return 0
    else:
        return 1
Warning
This function is purely for demonstrative purposes and is not based on scientific methodology in any way.
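To see the effect of the construction type, you can call the same logic directly with two hand-made records. This is only a sketch: the dictionaries and the 0.15m depth are illustrative assumptions, not values from the tutorial data.

```python
def function(building, hazard):
    # Same logic as functions/threshold_by_cons.py
    construction = building['Cons_Frame']
    if construction == 'Masonry':
        threshold = 0.2
    else:
        threshold = 0.1
    if hazard is None or hazard <= threshold:
        return 0
    return 1


depth = 0.15  # metres, illustrative value between the two thresholds

masonry = function({'Cons_Frame': 'Masonry'}, depth)
timber = function({'Cons_Frame': 'Timber'}, depth)
print(masonry, timber)
```

At this depth the masonry building falls below its 0.2m threshold while the timber building exceeds its 0.1m threshold, so only the timber building is counted as exposed.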
Add the following to your project.ini
file and save it.
[function threshold_by_construction]
description = simple example of checking the building construction type
location = functions/threshold_by_cons.py
argument-types = [building: anything, hazard: nullable(floating)]
return-type = integer
This definition is very similar to the previous INI file function definition.
We have only changed the function’s name, the .py
file location, and its description.
Tip
We recommend using underscores (_
) rather than hyphens (-
) in your function names.
Now try using this new function in your model by running the following command:
riskscape model run exposure-by-region -p "analysis.function=threshold_by_construction"
It should produce an event-impact.csv
file containing the following results.
Region,Exposed_buildings
,10
Aleipata Itupa i Lalo,476
Aleipata Itupa i Luga,307
Falealili,679
Lepa,262
Lotofaga,126
You can see that the results have changed again to reflect the changed logic in our function.
Returning complex consequences
The consequence
, or return value, of our function can also be made up of several different attributes.
For example, we might want to calculate several different damage states, or return the losses
for building and land damage separately.
In order to do this, our function simply needs to return a Python dictionary.
However, we have to make sure the return-type
in our INI file function definition matches
the return value in our Python code.
In the functions/
sub-directory there is an exposure_level.py
file that contains the following code:
def function(building, hazard_depth):
    result = {}

    if hazard_depth is None or hazard_depth <= 0:
        result['exposed'] = 0
        result['level'] = 'N/A'
        return result

    if hazard_depth > 3.0:
        level = 'Exposure >3.0m'
    elif hazard_depth > 2.0:
        level = 'Exposure >2.0m to <=3.0m'
    elif hazard_depth > 1.0:
        level = 'Exposure >1.0m to <=2.0m'
    else:
        level = 'Exposure >0.0m to <=1.0m'

    result['exposed'] = 1
    result['level'] = level
    return result
It returns two attributes:
exposed: whether or not the building was exposed to the hazard, as 0 or 1, i.e. an integer.
level: the range of inundation the building falls into, as a text string.
In RiskScape, a set of related attributes is called a Struct.
For example, the RiskScape model holds the building data from the exposure-layer in an exposure
struct.
Add the following to your project.ini
file and save it.
[function exposure_level]
description = example of a function that returns multiple things
location = functions/exposure_level.py
argument-types = [building: anything, hazard: nullable(floating)]
return-type = struct(exposed: integer, level: text)
Notice that the return-type
line looks quite different this time.
We now return a struct
type, which contains two attributes: exposed
(an integer
) and level
(a text
string).
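One way to catch a mismatch between the Python code and the return-type early is to check the returned dictionary outside RiskScape. The sketch below is a rough illustration only: EXPECTED and matches_struct are hypothetical names, and mapping the integer and text types to Python's int and str is an assumption, not a description of how RiskScape performs its own checks.

```python
def function(building, hazard_depth):
    # Same logic as functions/exposure_level.py
    result = {}
    if hazard_depth is None or hazard_depth <= 0:
        result['exposed'] = 0
        result['level'] = 'N/A'
        return result

    if hazard_depth > 3.0:
        level = 'Exposure >3.0m'
    elif hazard_depth > 2.0:
        level = 'Exposure >2.0m to <=3.0m'
    elif hazard_depth > 1.0:
        level = 'Exposure >1.0m to <=2.0m'
    else:
        level = 'Exposure >0.0m to <=1.0m'

    result['exposed'] = 1
    result['level'] = level
    return result


# Hypothetical mirror of: struct(exposed: integer, level: text)
EXPECTED = {'exposed': int, 'level': str}


def matches_struct(result, expected):
    """True if result has exactly the expected keys, each with the expected Python type."""
    if set(result) != set(expected):
        return False
    return all(isinstance(result[key], kind) for key, kind in expected.items())


# Illustrative depths covering every branch, including a missing hazard value
samples = [None, 0.0, 0.5, 1.5, 2.5, 4.0]
checks = [matches_struct(function({}, depth), EXPECTED) for depth in samples]
print(all(checks))
```

If you later add an attribute to the returned dictionary, a check like this fails until the expected attributes (and the return-type in project.ini) are updated to match.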
Tip
To see what built-in types are supported by RiskScape (i.e. integer
, text
, etc),
you can use the riskscape type-registry list
command.
Try using this function in a model by running the following command:
riskscape model run group-by-consequence -p "analysis.function=exposure_level"
We are using a different model this time (group-by-consequence
),
which aggregates the results by consequence rather than by region.
It should produce an event-impact.csv
file that contains the following results:
consequence.exposed,consequence.level,Total_buildings
0,N/A,4201
1,Exposure >0.0m to <=1.0m,473
1,Exposure >1.0m to <=2.0m,394
1,Exposure >2.0m to <=3.0m,472
1,Exposure >3.0m,720
Type definitions
When there are many different attributes we want to return, defining a struct
type
for the function’s return-type
can get a little awkward.
To make life easier, we can define our own struct types separately in the project.ini
file.
For example, add the following to your project.ini
file and save it.
[type exposure_result]
type.exposed = integer
type.level = text
This defines a struct
type called exposure_result
, which contains two attributes: exposed
and level
.
We can now use this type by name (i.e. exposure_result
) for any function’s return-type
or argument-types
.
In your project.ini
file, modify the return-type
line for your exposure_level
function definition,
so that it looks like this:
[function exposure_level]
description = example of a function that returns multiple things
location = functions/exposure_level.py
argument-types = [building: anything, hazard: nullable(floating)]
return-type = exposure_result
This function definition will work exactly the same as it did previously. Try it out by running the model command again:
riskscape model run group-by-consequence -p "analysis.function=exposure_level"
Errors in your function
Let’s look at what happens when something goes wrong with our function.
In the functions/
sub-directory there is a bad.py
file.
This tries to access an attribute that isn’t present in our exposure-layer data.
def function(building, hazard):
    construction = building['Bad_attribute']

    if hazard is None or hazard <= threshold:
        return 0
    else:
        return 1
Add the following to your project.ini
file and save it.
[function bad_function]
description = the exposure-layer attributes do not match what function expects
location = functions/bad.py
argument-types = [building: anything, hazard: nullable(floating)]
return-type = integer
Now try using this new function in your model by running the following command:
riskscape model run exposure-by-region -p "analysis.function=bad_function"
It should produce the following error:
Problems found with wizard model
- Execution of your data processing pipeline failed. The reasons for this follow:
- Failed to evaluate `{*, consequence: map(hazard, hv -> bad_function(exposure, hv))}`
- A problem occurred while executing the function 'bad_function'. Please check
your Python code carefully for the likely cause.
- KeyError: Bad_attribute - File
"file:///C:/RiskScape_Projects/project-tutorial/functions/bad.py", line 2
This message tells us the details of the Python exception that occurred
(KeyError
for Bad_attribute
) and the line number in the Python file that triggered the problem.
This is just an example of what function errors look like in RiskScape.
You don’t have to fix up the bad.py
Python code unless you want to.
Note
You will get this sort of error if you change your exposure-layer and it does not
contain the attributes that your function expects. You can use RiskScape’s type system
to detect this problem, if you specify a struct
for the argument-types
instead of
using anything
.
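If you keep the anything type, a lightweight alternative is to check for the attributes your function relies on before using them. The sketch below is only an illustration: REQUIRED and missing_attributes are hypothetical names, the sample records are made up, and it assumes the building argument behaves like a Python dictionary, as described earlier.

```python
# Attributes this (hypothetical) function relies on
REQUIRED = ['Cons_Frame']


def missing_attributes(building, required=REQUIRED):
    """Return the names of required attributes that are absent from the record."""
    return [name for name in required if name not in building]


# Made-up records for illustration
good_record = {'Cons_Frame': 'Timber', 'Use_Cat': 'Residential'}
bad_record = {'Use_Cat': 'Residential'}

print(missing_attributes(good_record))
print(missing_attributes(bad_record))
```

A function could call a helper like this on its first line and raise an exception that names the missing attributes, which may be easier to act on than a bare KeyError.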
Case study: damage state functions
The next example looks at how the research paper Evaluating building exposure and economic loss changes after the 2009 South Pacific Tsunami used a RiskScape function to calculate building damage.
This research used a fragility curve to determine the probability
of damage to a building, based on a given tsunami hazard intensity measure.
Five different damage states were used, from light non-structural damage (DS_1
),
through to complete structural collapse (DS_5
).
The RiskScape function uses a log-normal Cumulative Distribution Function (CDF) to determine the conditional probability (between 0 and 1.0) of a building being in a given damage state as a result of the tsunami inundation.
The shape of the log-normal CDF curve will be different depending on the building’s construction material and the damage state being investigated. This means that different mean and standard deviation values will be used to build the log-normal CDF curve.
The Python code looks like this:
def function(building, hazard_depth):
    DS_1_Prob = 0.0
    DS_2_Prob = 0.0
    DS_3_Prob = 0.0
    DS_4_Prob = 0.0
    DS_5_Prob = 0.0
    construction = building["Cons_Frame"]

    if hazard_depth is not None and hazard_depth > 0:
        DS_1_Prob = log_normal_cdf(hazard_depth, -0.53, 0.46)

        if construction in ['Masonry', 'Steel']:
            DS_2_Prob = log_normal_cdf(hazard_depth, -0.33, 0.4)
            DS_3_Prob = log_normal_cdf(hazard_depth, 0.1, 0.35)
            DS_4_Prob = log_normal_cdf(hazard_depth, 0.26, 0.41)
            DS_5_Prob = log_normal_cdf(hazard_depth, 0.39, 0.4)
        elif construction in ['Reinforced_Concrete', 'Reinforced Concrete']:
            DS_2_Prob = log_normal_cdf(hazard_depth, -0.33, 0.4)
            DS_3_Prob = log_normal_cdf(hazard_depth, 0.13, 0.56)
            DS_4_Prob = log_normal_cdf(hazard_depth, 0.53, 0.54)
            DS_5_Prob = log_normal_cdf(hazard_depth, 0.86, 0.94)
        else: # 'Timber' or unknown
            DS_2_Prob = log_normal_cdf(hazard_depth, -0.33, 0.4)
            DS_3_Prob = log_normal_cdf(hazard_depth, 0.06, 0.38)
            DS_4_Prob = log_normal_cdf(hazard_depth, 0.1, 0.4)
            DS_5_Prob = log_normal_cdf(hazard_depth, 0.1, 0.28)

    result = {}
    result['DS_1'] = DS_1_Prob
    result['DS_2'] = DS_2_Prob
    result['DS_3'] = DS_3_Prob
    result['DS_4'] = DS_4_Prob
    result['DS_5'] = DS_5_Prob
    return result

def log_normal_cdf(x, mean, stddev):
    # this uses the built-in RiskScape 'lognorm_cdf' function
    return functions.get('lognorm_cdf').call(x, mean, stddev)
Note
This function was provided by NIWA and has been refactored and adapted for this tutorial.
There are two things of note about this Python code:
The Python file contains two functions. RiskScape will always try to use the def function(... block of Python code.
A built-in RiskScape function (lognorm_cdf) is used to calculate the log-normal CDF. This is the functions.get('lognorm_cdf').call(... line in the code. You can find out more about this built-in function by entering the riskscape function info lognorm_cdf command.
Note
Calling a built-in RiskScape function from Python is only possible if you use the Jython Python implementation.
RiskScape Python functions use Jython by default, but you can switch to CPython instead.
CPython is recommended if you want to import packages, such as numpy
or scipy
.
The RiskScape documentation explains more about the difference between Jython vs CPython.
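If you do switch to CPython, the log_normal_cdf helper would need a replacement that does not call the built-in RiskScape function. The sketch below is one possible stand-in using only the Python standard library. It assumes that the mean and stddev arguments describe the natural logarithm of the depth (the usual log-normal parameterisation); that assumption has not been checked against RiskScape's lognorm_cdf, so compare a few values before relying on it.

```python
import math


def log_normal_cdf(x, mean, stddev):
    """Log-normal CDF, where mean and stddev describe ln(x) (assumed parameterisation)."""
    if x <= 0:
        # The log-normal distribution has no probability mass at or below zero
        return 0.0
    z = (math.log(x) - mean) / (stddev * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


# Using the DS_1 parameters from the function above as an example
median_depth = math.exp(-0.53)  # depth at which the curve crosses 0.5
p_median = log_normal_cdf(median_depth, -0.53, 0.46)
p_shallow = log_normal_cdf(0.1, -0.53, 0.46)
p_deep = log_normal_cdf(3.0, -0.53, 0.46)
print(p_median, p_shallow, p_deep)
```

Under this parameterisation the probability is 0.5 where the depth equals exp(mean) and rises towards 1.0 as the depth increases, which matches the fragility-curve behaviour described above.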
In order to use this function, add the following to your project.ini
file and save it.
[type building]
type.Cons_Frame = text
[type damage_states]
type.DS_1 = floating
type.DS_2 = floating
type.DS_3 = floating
type.DS_4 = floating
type.DS_5 = floating
[function Samoa_Building_Fragility]
description = Samoa tsunami fragility functions for buildings
location = functions/Samoa_Building_Fragility.py
argument-types = [building, hazard: nullable(floating)]
return-type = damage_states
framework = jython
As well as defining the function, this defines types that the function uses for its argument-types
and return-type
.
Note
The building
struct we defined only has one attribute, but our exposure-layer input data has several more attributes.
The argument-types
only need to define the exposure-layer attributes that your function actually uses (Cons_Frame
here).
This will make your functions easier to reuse with different input data.
We also want to import the pre-existing building-fragility
model into our project, which will use the new function.
Go to the top of your project.ini
file and add the line models = models/models_building-fragility.ini
to the [project]
section. The [project]
section in your project.ini
file should now look like this:
[project]
description = Initial project file. You will add more bookmarks and functions to it
models = models/models_exposure-by-region.ini
models = models/models_group-by-consequence.ini
models = models/models_building-fragility.ini
...
Try running the model with the following command:
riskscape model run building-fragility
This should produce an event-impact.csv
results file.
Open these results in a spreadsheet application.
The results are aggregated by region.
As well as the total Exposed_buildings
, we can also see a count of how
many buildings have > 0.5 or > 0.9 probability of being in damage state 5 (complete structural collapse).
Some percentiles are also recorded for damage state 5 and for inundation depth.
Recap
Let’s review some of the key points we have covered so far:
The project.ini file holds the bookmarks and functions that the model will use.
Bookmarks configure the input data that RiskScape models can use.
The attributes in a RiskScape model correspond to the attributes that are present in the input data.
All the data in a RiskScape model has type information associated with it.
Bookmarks let you manipulate the input data before it gets used by the model.
The input data for RiskScape models always needs a geometry-type attribute present and a CRS defined.
File-paths and bookmarks can often be used interchangeably in RiskScape. In particular, shapefiles, GeoTIFFs, ESRI Grid, and GeoJSON files generally have all the information RiskScape needs, such as the CRS, saved as part of the file format.
The riskscape bookmark info command is a useful way to find out more about a file or bookmark, such as the attributes the data contains or its CRS.
You always need to define a bookmark in order to use CSV input data in a model. The bookmark will need to define set-attribute.geom and crs-name for the CSV data.
RiskScape can do some error-checking on the input data, such as whether the geometry is valid.
You can use the riskscape format info command to find out more about what parameters a bookmark supports.
RiskScape models use a Python function to determine the impact that the hazard has on each element-at-risk. The function’s return value becomes the consequence in the model’s results.
The function gets passed the exposure-layer input data, along with the hazard intensity measure. These values are called the function’s arguments.
A set of related attributes (i.e. attributes that come from the same input layer) is called a struct in RiskScape. In your Python function, a struct is simply a Python dictionary.
The hazard function argument is nullable. If no hazard intensity measure was determined, then your function will be passed a hazard value equal to None.
You can optionally define your own struct types in your project.ini. This can make it easier to define your functions. Alternatively, you can use anything for your function’s argument-types if you’re not sure what type the data is.
If there is a coding error in your Python function, then you will get the Python error reported when you try to use the function in a RiskScape model.
RiskScape uses the Jython Python implementation by default, but you can switch to CPython if you want to use packages like numpy or scipy.
Once you feel comfortable with project files, you could go through Recapping the basics.
Extra for experts
If you want to explore bookmarks and functions a little further, you could try the following exercises out on your own.
Practice adding a description to some of the bookmarks you created in the project.ini file. Try also using # to add a few INI file comments.
Some buildings are not assigned to any region when you run the exposure-by-region model. Try specifying the sample.areas-buffer parameter when you run the model. See if you can work out the buffer distance needed to assign all buildings to a region. Start off with 100m, 250m, 500m, 1000m, and so on.
Try creating a bookmark for the data/Building_XY_coords.csv file and use this as the exposure-layer in the exposure-by-region model. This file contains separate POINT_X, POINT_Y coordinate attributes for the geometry, so you will have to use create_point() instead of geom_from_wkt() in the bookmark.
Try creating a bookmark for the data/bad-data.csv file and use this as the exposure-layer in the exposure-by-region model. It will report warnings that rows are being skipped. See if you can identify the problem in the CSV file and fix it.
Try fixing up the bad_function / functions/bad.py code so that it works with the riskscape model run command.
In the functions/ sub-directory there is a buggy.py Python file that has a couple of problems with it. Add this function to your project and try using it in the exposure-by-region model. Look at the Python error that the riskscape model run command gives you and try to fix it in the buggy.py file. Re-run the command until the model runs successfully.
Try adding some debug output to the buggy.py Python function. Add the statements below to the Python code and then run the function in the exposure-by-region model again. Make sure you use the building centroid CSV as the exposure-layer, i.e. -p "input-exposures.layer=building_centroids_csv".
if building['ID'] == '1000' or building['ID'] == '7000':
    print("ID: {} Cons_Frame: {} Use_Cat: {} hazard: {}".format(building['ID'], building['Cons_Frame'], building['Use_Cat'], hazard))