Interface PipelineOutputContainer
- All Superinterfaces:
AutoCloseable
A container for storing outputs from a pipeline run.
In addition to storing the outputs created via createSinkForStep(SinkParameters), the output container should also store the:
- PipelineDeclaration (if available)
- Manifest
- progress metrics (MetricRegistry)
-
Method Summary
- void close(): Finish writing to the output container.
- createSinkForStep(SinkParameters parameters): Create a Sink for the given terminal Step.
- PipelineOutputStore getStore(): Returns the store that created this container.
- URI getStoredAt(): Returns a URI that describes where the outputs are stored.
- default void registerLocalFile(Path localFile): Registers a local file to be stored as part of the pipeline's outputs.
-
Method Details
-
createSinkForStep
Create a Sink for the given terminal Step. This sink can then be used to 'save' tuples that form a model output. The format they are written in depends on what the PipelineOutputStore supports.
- Parameters:
parameters - to tailor the created sink
- Returns:
- sink or problems preventing one from being created
-
registerLocalFile
Registers a local file to be stored as part of the pipeline's outputs. The given file may be moved or copied into a separate location (alongside the other outputs) when the container is closed.
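The deferred behaviour described above can be illustrated with a minimal sketch. It assumes a directory-backed container that copies (rather than moves) registered files when it is closed. The class DirectoryContainerSketch and its constructor are hypothetical stand-ins for illustration only, not the real implementation, which may behave differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical directory-backed container; not the real implementation. */
public class DirectoryContainerSketch implements AutoCloseable {

    private final Path outputDir;
    private final List<Path> registered = new ArrayList<>();

    public DirectoryContainerSketch(Path outputDir) {
        this.outputDir = outputDir;
    }

    /** Only remembers the file; nothing is copied until close(). */
    public void registerLocalFile(Path localFile) {
        registered.add(localFile);
    }

    /** Copies every registered file alongside the other outputs. */
    @Override
    public void close() {
        try {
            Files.createDirectories(outputDir);
            for (Path file : registered) {
                Files.copy(file, outputDir.resolve(file.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path scratch = Files.createTempDirectory("scratch");
        Path outputs = Files.createTempDirectory("outputs");
        Path log = Files.writeString(scratch.resolve("run.log"), "pipeline finished\n");

        try (DirectoryContainerSketch container = new DirectoryContainerSketch(outputs)) {
            container.registerLocalFile(log);
            // Registration alone does not copy anything.
            System.out.println(Files.exists(outputs.resolve("run.log")));
        }
        // After close() the file sits alongside the other outputs.
        System.out.println(Files.exists(outputs.resolve("run.log")));
    }
}
```

Because the copy happens at close time, callers should keep the registered file in place until the container has been closed.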
-
close
void close()
Finish writing to the output container. At this point all Sinks should have been closed.
If the store supports writing the progress metrics, now is the time to write them.
- Specified by:
close in interface AutoCloseable
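One way to satisfy the "all sinks closed first" expectation is to rely on try-with-resources, which closes resources in reverse order of declaration. The sketch below uses hypothetical stand-in classes (RecordingSink, RecordingContainer, and a parameterless createSink()) purely to show the ordering. It assumes sinks are AutoCloseable, which this page does not confirm.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOrderSketch {

    /** Hypothetical stand-in for a sink; records when it is closed. */
    static class RecordingSink implements AutoCloseable {
        private final List<String> events;
        RecordingSink(List<String> events) { this.events = events; }
        void accept(String tuple) { events.add("write " + tuple); }
        @Override public void close() { events.add("sink closed"); }
    }

    /** Hypothetical stand-in for the output container. */
    static class RecordingContainer implements AutoCloseable {
        private final List<String> events;
        RecordingContainer(List<String> events) { this.events = events; }
        RecordingSink createSink() { return new RecordingSink(events); }
        @Override public void close() {
            // By now all sinks should be closed; a real store that supports
            // progress metrics would write them at this point.
            events.add("container closed");
        }
    }

    public static List<String> run() {
        List<String> events = new ArrayList<>();
        // Resources are closed in reverse order of declaration,
        // so the sink is closed before the container.
        try (RecordingContainer container = new RecordingContainer(events);
             RecordingSink sink = container.createSink()) {
            sink.accept("a");
            sink.accept("b");
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Declaring the container first and the sinks after it keeps the ordering correct even when an exception is thrown while writing.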
-
getPipelineOutputs
- Returns:
- a map of output names to storage locations. TODO: we might want to replace URI with something a bit more informative, such as number of rows, size, checksum, etc.
-
getStore
PipelineOutputStore getStore()
- Returns:
- the store that created this container.
-
getStoredAt
URI getStoredAt()
Returns a URI that describes where the outputs are stored. This URI is typically a 'container' URI for the individual pipeline output URIs, but exactly how they relate depends on the format or type of storage.
Note that at this stage we're not making any guarantees that this will be generally useful beyond being informational. Ideally, other components could reason about these URIs based on their structure and do something useful with them, e.g. if it looks like a local directory, you could open it; if it's a database URL, you could attempt to connect to it.
Note also that this URI might differ from the one the user gave, e.g. we might prepend directories to a file path, or resolve/follow a URI to some other location. It should therefore be used in preference to the one the user gave when presenting information back to the user.
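As an illustration of the kind of structural reasoning mentioned above, the following sketch inspects only the URI scheme. The class StoredAtSketch, the describe method, and the particular schemes it recognises are assumptions made for this example; nothing here is guaranteed by the interface, and the result should be treated as informational only.

```java
import java.net.URI;
import java.util.Locale;

public class StoredAtSketch {

    /** Best-effort, informational description of a container URI. */
    public static String describe(URI storedAt) {
        String scheme = storedAt.getScheme();
        if (scheme == null) {
            // Relative or scheme-less URI: nothing can be inferred.
            return "unknown location: " + storedAt;
        }
        // URI schemes are case-insensitive, so normalise before comparing.
        switch (scheme.toLowerCase(Locale.ROOT)) {
            case "file":
                // Looks like a local file or directory that could be opened.
                return "local path: " + storedAt.getPath();
            case "jdbc":
            case "postgresql":
                // Looks like a database URL that could be connected to.
                return "database: " + storedAt;
            default:
                return "other location: " + storedAt;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(URI.create("file:///tmp/outputs/run-1")));
        System.out.println(describe(URI.create("jdbc:postgresql://localhost/results")));
        System.out.println(describe(URI.create("outputs/run-1")));
    }
}
```

A caller presenting results to the user would pass the value from getStoredAt() rather than the location the user originally supplied, since the two may differ.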
-