Coverage data

A coverage is essentially a spatial lookup table. Coverages are usually grid-based GeoTIFF or ASC raster files.

RiskScape handles coverage data slightly differently from relational data. A relation holds rows (records) of data, such as CSV data or shapefile vector data.

Coverages are typically used in spatial sampling, which is geospatially matching data in your exposure-layer to the coverage layer. For example, RiskScape can take the geometry of a building footprint from your exposure-layer and use it as a lookup into the hazard-layer coverage. The result it returns is the hazard intensity measure (if any) for that particular building.
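To make the "spatial lookup table" idea concrete, here is a minimal Python sketch of sampling a grid by coordinate. It is an illustration only and not how RiskScape is implemented: the `sample_grid` function, the grid layout, and the sample values are all hypothetical, and it samples a single point (such as a building centroid) rather than a full footprint geometry.

```python
from typing import Optional, Sequence


def sample_grid(grid: Sequence[Sequence[Optional[float]]],
                origin_x: float, origin_y: float, cell_size: float,
                x: float, y: float) -> Optional[float]:
    """Return the grid value at (x, y), or None if there is nothing there.

    Assumes row 0 is the top (northernmost) row and that
    (origin_x, origin_y) is the top-left corner of the grid.
    """
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    if row < 0 or row >= len(grid) or col < 0 or col >= len(grid[0]):
        return None  # the point falls outside the grid
    return grid[row][col]


# Hypothetical 2x3 flood-depth grid (metres) with 10 m cells,
# whose top-left corner sits at (0, 20).
depths = [
    [0.0, 0.5, 1.2],
    [None, 0.3, 0.8],  # None stands in for a no-data cell
]

building_centroid = (25.0, 15.0)
depth = sample_grid(depths, 0.0, 20.0, 10.0, *building_centroid)
print(depth)
```

A building that falls outside the grid, or on a no-data cell, gets `None` back, which corresponds to the "(if any)" case described above.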

Tip

Normally coverages are used for hazard-layers, but you can also use coverage files (i.e. .tif or .asc files) as the exposure-layer in a wizard or pipeline model. This can be useful if your exposure-layer is a population density map, or similar data. Each cell in the coverage will be treated as a square polygon input to your model.

Bookmarks

Setting up a bookmark for a coverage is pretty simple. For example, you could add something like the following to your project.ini file:

[bookmark MY_COOL_NAME]
description = Optionally specify additional details about the data here...
location = MY/COOL/DATA.tif

You can generally use paths to .tif and .asc files directly, without necessarily needing to configure a bookmark.

Tip

If you have lots of similar coverage files that you want to run through the same model, and get a separate set of results output for each coverage, then this is simple to do in RiskScape. Refer to Running the same model repeatedly for more details.

Transform the sampled value

You can apply your own custom transformation to the data returned by RiskScape’s spatial sampling. This can be handy if your coverage data doesn’t match what is expected by the model, for example, if one file’s data is in units of gravity (g) and another file is in log units (log(g)).

In your coverage bookmark, you can specify a simple mapping expression that will modify any values sampled from it. This has the benefit of better model reuse, i.e. you don’t have to create a separate model just because the input data is in a slightly different format.

The following bookmark takes a GeoTIFF file in g units and converts the data into log(g) when it gets used in a model. The value in the expression is the value that was sampled from the GeoTIFF.

[bookmark hazard-data-in-log-units]
location = DATA_IN_G_UNITS.tif
map-value = log(value)

Alternatively, you can use a lambda expression in the bookmark, which makes the identifier for the data value clearer and customizable. The following example is equivalent to the previous bookmark, except that it uses a lambda expression.

[bookmark the-same-thing-with-lambda]
location = DATA_IN_G_UNITS.tif
map-value = g -> log(g)

In the above example, the lambda argument is called g, but you can call this whatever you want.
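Conceptually, the mapping is just a function applied to each sampled value before the model sees it. The Python sketch below shows that idea under stated assumptions: `sample_with_mapping` is a hypothetical helper, natural log is used purely for illustration (the base RiskScape's `log` function uses is not stated here), and passing "no data" through untouched is an assumption rather than documented behaviour.

```python
import math
from typing import Callable, Optional


def sample_with_mapping(sampled: Optional[float],
                        map_value: Callable[[float], float]) -> Optional[float]:
    """Apply map_value to a sampled value, leaving 'no data' (None) alone."""
    if sampled is None:
        return None
    return map_value(sampled)


# Plays the role of the lambda 'g -> log(g)' in the bookmark above.
to_log_units = lambda g: math.log(g)

pga_in_g = 0.25  # hypothetical value sampled from the GeoTIFF
pga_in_log_g = sample_with_mapping(pga_in_g, to_log_units)
print(pga_in_log_g)
```

Because the conversion lives alongside the data rather than in the model, the same model can be reused with a file that is already in log units simply by omitting the mapping.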

Sampling relational data

You can also turn relational data (e.g. shapefile input data) into a coverage that can be used for spatial sampling. This can be useful for matching elements-at-risk to the regional area they are located in, or if your hazard is vector data.

Tip

If you are using the wizard to build a model, then you do not have to worry too much about whether your input data is relational or in coverage form. RiskScape will take care of it all for you.

It can sometimes be useful to be able to use raster data (e.g. GeoTIFFs) or vector data (e.g. shapefiles) interchangeably as input data for your model. The simplest way to do this is to specify that the relational data should be rasterized when you create your bookmark. For example:

[bookmark relation-as-coverage]
location = MY/COOL/DATA.shp
rasterize = true
rasterize-grid-size = 50
rasterize-expression = MY_ATTRIBUTE

When relational data is rasterized, you need to specify:

  1. The grid-size that the coverage should have, in metres. The above example uses a 50m by 50m grid.

  2. An expression for the numeric value to return when the coverage is sampled. Basically, this is the attribute in the shapefile that you are most interested in. It could also be a combination of attributes, e.g. Depth * Velocity.
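The two settings above can be pictured with a small Python sketch. It is a simplified illustration, not RiskScape's rasterization: the `rasterize` function is hypothetical, features are axis-aligned rectangles rather than arbitrary polygons, and a cell is assumed to take a feature's value when the cell centre falls inside that feature.

```python
import math
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def rasterize(features: List[Tuple[Rect, float]],
              extent: Rect, grid_size: float) -> List[List[Optional[float]]]:
    """Burn each feature's value into every cell whose centre it contains."""
    min_x, min_y, max_x, max_y = extent
    cols = math.ceil((max_x - min_x) / grid_size)
    rows = math.ceil((max_y - min_y) / grid_size)
    grid: List[List[Optional[float]]] = [[None] * cols for _ in range(rows)]
    for row in range(rows):
        for col in range(cols):
            # Centre of this cell; row 0 is the top row.
            cx = min_x + (col + 0.5) * grid_size
            cy = max_y - (row + 0.5) * grid_size
            for (fx0, fy0, fx1, fy1), value in features:
                if fx0 <= cx <= fx1 and fy0 <= cy <= fy1:
                    grid[row][col] = value
                    break  # the first matching feature wins (arbitrary choice)
    return grid


# One hypothetical flood polygon; the burned-in value plays the role of
# an expression such as Depth * Velocity.
depth, velocity = 1.5, 2.0
features = [((0.0, 0.0, 100.0, 100.0), depth * velocity)]

grid = rasterize(features, extent=(0.0, 0.0, 150.0, 100.0), grid_size=50.0)
print(grid)
```

A smaller grid size follows the feature outlines more closely but produces more cells, so the choice is a balance between fidelity and the size of the resulting coverage.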

Advanced RiskScape users can also turn relational data into a coverage directly in RiskScape pipeline code, without using a bookmark. See Sampling a relation for an example.

Note

If there is any overlapping geometry in the input data, then spatial sampling will arbitrarily pick one of the matching geometries. We recommend ensuring there is no overlapping geometry in your input data.

Nearest-neighbour coverage

A GeoTIFF stores varying hazard intensities across a geospatial grid. Other file formats, in particular NetCDF and HDF5, can represent similar hazard data through a mesh of geospatial points.

For example, you may have PGA shaking intensities or temperature readings across a series of points, or ‘sites’. To determine the hazard intensity for a given element-at-risk, you simply need to find the site that is closest to it.

In this case, you will want to use a nearest neighbour coverage in your model. Normally RiskScape’s spatial sampling will look for intersecting geometry, whereas a nearest neighbour coverage lets us match the closest point, even if it doesn’t intersect directly with our exposure-layer geometry.

Note

Currently, nearest neighbour coverages are only available in pipelines, and so they are only suitable for advanced users.

You will usually want to specify a cut-off distance in metres for your nearest neighbour coverage. Otherwise, the coverage will always find the closest match, even if it is thousands of kilometres away.

Determining a suitable cut-off distance is a trade-off between accuracy and performance. If the cut-off is too small, then sampling the coverage might not find any matching data. If the cut-off is too large, then your model may take longer to run, as there will be more potential matches to narrow down.
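The effect of the cut-off can be sketched in a few lines of Python. This is a brute-force illustration under stated assumptions, not RiskScape's implementation: `nearest_site_value` is a hypothetical function, the site values are invented, and coordinates are assumed to be planar and in metres.

```python
import math
from typing import List, Optional, Tuple

Site = Tuple[float, float, float]  # (x, y, hazard intensity)


def nearest_site_value(sites: List[Site], x: float, y: float,
                       max_distance: Optional[float] = None) -> Optional[float]:
    """Return the intensity at the closest site, or None if the closest
    site is further away than max_distance (when a cut-off is given)."""
    best_value: Optional[float] = None
    best_distance = math.inf
    for sx, sy, value in sites:
        distance = math.hypot(sx - x, sy - y)
        if distance < best_distance:
            best_distance, best_value = distance, value
    if max_distance is not None and best_distance > max_distance:
        return None
    return best_value


# Hypothetical PGA readings at three sites.
sites = [(0.0, 0.0, 0.30), (1000.0, 0.0, 0.45), (0.0, 2000.0, 0.10)]

# An element-at-risk roughly 141 m from the second site.
print(nearest_site_value(sites, 900.0, 100.0))                      # no cut-off
print(nearest_site_value(sites, 900.0, 100.0, max_distance=200.0))  # within cut-off
print(nearest_site_value(sites, 900.0, 100.0, max_distance=100.0))  # too far away
```

This sketch checks every site on every lookup. A real spatial index avoids that by only considering sites near the query location, which is why a larger cut-off means more candidates to narrow down.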

To build a nearest neighbour coverage, you can specify additional options in the to_coverage() function. For example:

to_coverage(bookmark('YOUR_POINT_BASED_HAZARD'),
            options: { index: 'nearest_neighbour', nearest_neighbour_max_distance: $cutoff_distance }
           ) as nn_coverage

Note

Currently interpolation or attenuation of the hazard intensity measure is not supported, i.e. RiskScape will not find the three closest points and then take the average reading.


Optimizing spatial sampling

When RiskScape builds a coverage from polygonal data, it uses an intersection index to store the indexed data. This type of index uses a data structure called an STR tree to efficiently look up the indexed data by bounding box. Once the index has found candidate data, it tests the key data against the candidates using a more computationally expensive intersection method.

Where the indexed features are relatively small and uniform, this index performs well. If the indexed features are large (thousands of coordinates) or have a large bounding box relative to their area, the index starts to slow down. You can optimize the index in this situation by supplying the intersection_cut option when building the index, for example:

to_coverage(bookmark('YOUR_POLYGONS'), options: {intersection_cut: true})

When set, this option will break apart the given geometries before indexing them to reduce the area they cover and bring down the number of coordinates that need to be checked when computing intersections. For more advanced information, refer to the index code.
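The following Python sketch illustrates why cutting helps. It is a simplified model rather than RiskScape's index: a linear scan over bounding boxes stands in for the STR tree, a ray-casting point-in-polygon test stands in for the expensive intersection check, and the L-shaped polygon is cut into two pieces by hand. All function names here are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bounding_box(poly: Polygon) -> Box:
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    return min(xs), min(ys), max(xs), max(ys)


def in_box(box: Box, pt: Point) -> bool:
    """Cheap first-pass filter."""
    return box[0] <= pt[0] <= box[2] and box[1] <= pt[1] <= box[3]


def point_in_polygon(poly: Polygon, pt: Point) -> bool:
    """Exact test by ray casting; its cost grows with the vertex count."""
    x, y = pt
    inside = False
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def query(polygons: List[Polygon], pt: Point) -> Tuple[int, int]:
    """Return (candidates that needed the exact test, actual matches)."""
    candidates = [p for p in polygons if in_box(bounding_box(p), pt)]
    matches = [p for p in candidates if point_in_polygon(p, pt)]
    return len(candidates), len(matches)


# An L-shaped polygon: area 19, but its bounding box covers area 100.
l_shape = [(0, 0), (10, 0), (10, 1), (1, 1), (1, 10), (0, 10)]
# The same shape cut by hand into two tightly bounded rectangles.
cut_pieces = [
    [(0, 0), (10, 0), (10, 1), (0, 1)],
    [(0, 1), (1, 1), (1, 10), (0, 10)],
]

pt = (5.0, 5.0)  # inside the L's bounding box, but outside the L itself
print(query([l_shape], pt))   # the exact test runs, then finds no match
print(query(cut_pieces, pt))  # the bounding boxes alone rule the point out
```

With the uncut shape, the point passes the bounding-box filter and the expensive test runs only to report a miss. After cutting, neither piece's bounding box contains the point, so the expensive test is skipped, and each piece also has fewer coordinates to check when it does run.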