Multi-file hazard data
Some probabilistic hazard datasets are organized by event, typically with each event stored in its own file. For example, probabilistic flood data may be represented by a series of raster files.
When probabilistic data is organized this way, the RiskScape pipeline can process the hazard data sequentially, event by event.
For example, it is more efficient for RiskScape to process a set of flood GeoTIFFs representing 100 different flooding scenarios one file at a time. Opening 100 large GeoTIFFs at once is slow and forces the same data to be read from disk over and over again.
Multiple GeoTIFFs example
If you have a set of GeoTIFFs, each one representing a different event, the following worked example shows how to process them into an event loss table.
To start, we need to create a CSV that enumerates the files and supplies any extra metadata that belongs to each event (e.g. an event ID, or probability metadata such as an ARI or exceedance probability). The most important column, though, is hazard_file: the path to the GeoTIFF file, relative to the CSV file.
eventid,hazard_file,probability_metadata
1,maps/flood_01.tif,0.001
2,maps/flood_02.tif,0.045
3,maps/flood_03.tif,0.012
...
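If you have many event files, writing this CSV by hand gets tedious. The Python sketch below generates it by scanning a directory of GeoTIFFs; the maps directory, file naming and probability values are illustrative assumptions, so adapt them to your own data.

import csv
from pathlib import Path

maps_dir = Path("maps")            # hypothetical folder containing flood_01.tif, flood_02.tif, ...
out_file = Path("flood_maps.csv")  # the CSV that our bookmark will point at

# placeholder probability metadata - replace with the real values for each event
probabilities = {"flood_01.tif": 0.001, "flood_02.tif": 0.045, "flood_03.tif": 0.012}

with out_file.open("w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["eventid", "hazard_file", "probability_metadata"])
    for i, tif in enumerate(sorted(maps_dir.glob("*.tif")), start=1):
        # hazard_file paths are relative to the CSV file
        writer.writerow([i, "maps/" + tif.name, probabilities.get(tif.name, "")])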
We can add this to our RiskScape project.ini as a relation dataset, which will allow us to process each event map one by one.
[bookmark flood_maps]
location = flood_maps.csv
# make sure the event ID is a number
set-attribute.eventid = int(eventid)
# this bit turns the hazard_file path into a coverage which we can spatially query
set-attribute.coverage = bookmark(id: hazard_file, options: {}, type: 'coverage(floating)')
With this bookmark, we can now build an event loss table for all the hazard maps in the CSV as though they were a single probabilistic dataset.
You will also need some sort of loss function present in your project.ini file. This example will assume the function is called loss_function, e.g.
[function loss_function]
location = my-project/loss_function.py
argument-types = [building: anything, gmv: floating]
return-type = floating
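What goes inside loss_function.py depends entirely on your vulnerability model. As a rough placeholder, a minimal sketch of the Python file could look like the following; the entry-point convention should match RiskScape's documentation on Python functions, and the depth thresholds, damage ratios and replacement cost are made-up values rather than a real flood vulnerability curve.

# my-project/loss_function.py - a minimal sketch only
def function(building, gmv):
    # gmv is the sampled flood depth the pipeline passes in as hazard_intensity
    if gmv is None or gmv <= 0.0:
        return 0.0
    # crude step-function damage ratio - replace with a proper vulnerability curve
    if gmv < 0.5:
        damage_ratio = 0.1
    elif gmv < 1.5:
        damage_ratio = 0.4
    else:
        damage_ratio = 0.8
    replacement_cost = 100000.0  # placeholder; ideally derived from the building's attributes
    return damage_ratio * replacement_cost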
Then, assuming we also have a buildings bookmark configured for our exposure-layer data, we can use the following pipeline to generate the event loss table.
input('flood_maps', name: 'event') as event_input
-> join.lhs
input('buildings', name: 'building') as exposure_input
-> join.rhs
# this combines each building with each map
join(on: true)
->
# each row is now a building and a flood map, so we can sample the flood depth
# (`hazard_intensity`) for each building
select({
    *,
    hazard_intensity: sample_centroid(
        geometry: building,
        coverage: event.coverage
    )
})
->
# we can now compute a loss value for each building
select({
    event: event,
    hazard_intensity: hazard_intensity,
    exposure: building,
    loss: loss_function(building, hazard_intensity)
})
->
# total the losses by event - this is our event loss table
group(
    select: {
        event,
        sum(loss) as total_loss
    },
    by: event
)
->
save('event-loss')
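Exactly where the event loss table ends up depends on the output settings for the save step. As a quick sanity check, the sketch below reads the saved table and prints the five largest event losses; the event-loss.csv file name and the total_loss column name are assumptions, so confirm them against your actual output.

import csv

# hypothetical output file name - check where your save('event-loss') step wrote to
with open("event-loss.csv") as f:
    rows = list(csv.DictReader(f))

# show the five events with the largest total loss
rows.sort(key=lambda row: float(row["total_loss"]), reverse=True)
for row in rows[:5]:
    print(row)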