Class WorkerTask
- All Implemented Interfaces:
AutoCloseable
- Direct Known Subclasses:
AccumulatorProcessorTask
,ChainTask
,SinkTask
,TupleInputTask
A WorkerTask is based off a TaskSpec and is what actually carries out the work for the pipeline step(s). When a TaskSpec supports parallel processing, it gets split into multiple WorkerTasks that all can run at the same time. The WorkerTasks get managed by the Scheduler and get assigned to a Worker thread to run.
-
Field Summary
Modifier and TypeFieldDescriptionprotected final int
protected final com.codahale.metrics.Meter
protected final com.codahale.metrics.Meter
protected final PageReader
protected final PageWriter
protected Object
Set this duringrun
to pass the processing result to any dependents.protected final com.codahale.metrics.Counter
protected final com.codahale.metrics.Histogram
protected final TaskSpec
-
Constructor Summary
-
Method Summary
Modifier and TypeMethodDescriptionvoid
close()
Override this method to clean up any resources that were allocated/created by this task for use during execution.nz.org.riskscape.engine.pipeline.ExecutionContext
nz.org.riskscape.engine.pipeline.RealizedStep
protected nz.org.riskscape.engine.pipeline.Realized
nz.org.riskscape.engine.pipeline.RealizedStep
final String
getName()
Returns an optional helper for reading pages of input Tuples from the ReadPageBuffer (note that not all tasks have an input buffer though).Returns an optional helper for writing pages of output Tuples to the WritePageBuffer (note that not all tasks have an output buffer though).getSpec()
protected boolean
protected boolean
boolean
Returns true if the task currently has a page of either input or output that it hasn't finished with yet.boolean
boolean
final boolean
Test for whether this task should run based on the state of the input buffer.final boolean
Test for whether this task should run based on the state of the output buffer.boolean
Returns true if the task has work it can do.boolean
void
void
abstract boolean
abstract ReturnState
run()
Processes the work that the task has to do.protected ReturnState
toString()
-
Field Details
-
spec
-
id
protected final int id -
pageWriter
-
pageReader
-
in
protected final com.codahale.metrics.Meter in -
out
protected final com.codahale.metrics.Meter out -
runtime
protected final com.codahale.metrics.Counter runtime -
runtimeAverage
protected final com.codahale.metrics.Histogram runtimeAverage -
processingResult
Set this during
run
to pass the processing result to any dependents. It's good hygiene to set this only once processing is complete, but theScheduler
won't consume it until the task signals it is complete.
-
-
Constructor Details
-
WorkerTask
-
WorkerTask
public WorkerTask()
-
-
Method Details
-
run
Processes the work that the task has to do. The task doesn't necessarily run to completion in one go - it's likely that the task will run out of input or output first, so it'll need to keep coming back and chipping away at the work.
-
runPublic
-
getPageReader
Returns an optional helper for reading pages of input Tuples from the ReadPageBuffer (note that not all tasks have an input buffer though).
-
getPageWriter
Returns an optional helper for writing pages of output Tuples to the WritePageBuffer (note that not all tasks have an output buffer though).
-
hasInputPage
protected boolean hasInputPage() -
hasOutputPage
protected boolean hasOutputPage() -
hasPageInProgress
public boolean hasPageInProgress()Returns true if the task currently has a page of either input or output that it hasn't finished with yet.
-
isInputReady
public final boolean isInputReady()Test for whether this task should run based on the state of the input buffer. Note that because of an awkward case in the scheduling logic, this method will return true if there are no tuples but the input buffer has been marked as complete. This allows the task to run once more and be run to completion. This is something we should probably be handling in the scheduler logic instead of this test.
- Returns:
- true if there are tuples in the input buffer to be read, or an incomplete page, or the input buffer has been marked as complete.
-
isOutputReady
public final boolean isOutputReady()Test for whether this task should run based on the state of the output buffer.
It will ignore the state of the output buffer if there's no more input - but this check may disappear once we have better test coverage and can confirm it's not relied upon. If it is, it should be in a task specific check.
- Returns:
- true if there is capacity in the output or the input is complete
-
isReadyToRun
public boolean isReadyToRun()Returns true if the task has work it can do. Returns false if the task is blocked waiting on either more input, more output buffers to free up, or it's dependent on other tasks that haven't completed yet.
-
taskComplete
-
getLastStep
public nz.org.riskscape.engine.pipeline.RealizedStep getLastStep() -
getFirstStep
public nz.org.riskscape.engine.pipeline.RealizedStep getFirstStep() -
getFirstStepRealizedResult
protected nz.org.riskscape.engine.pipeline.Realized getFirstStepRealizedResult() -
toString
-
close
public void close()Override this method to clean up any resources that were allocated/created by this task for use during execution. Will be called from the scheduler once the task has signaled it is complete, but it may also get called if a job that this task was part of has failed.
Thread safety should be ensured by requiring the scheduler to only call close on a task that is not currently being run.
Implementations shouldn't need to do any buffer management in this method, it's meant for closing things like
TupleIterator
s or other sorts of resources that follow theCloseable
pattern.- Specified by:
close
in interfaceAutoCloseable
-
isCreated
public boolean isCreated() -
isStarted
public boolean isStarted() -
isComplete
public boolean isComplete() -
getName
- Returns:
- a unique name for this worker task, based on the step(s) it covers, and the type of worker it is.
-
getSpecNameBrief
- Returns:
- a unique name for the worker's TaskSpec that is brief and user-friendly. This can be used as a simple way to represent the work that this task is doing to the user.
-
producesResult
public abstract boolean producesResult() -
consumeProcessingResult
- Returns:
- a processingResult that has been set, clearing its value in the process (to allow the result to be garbage collected in the fullness of time)
-
markStarted
public void markStarted() -
markComplete
public void markComplete() -
getSpec
-
getContext
public nz.org.riskscape.engine.pipeline.ExecutionContext getContext()
-