Class WorkerTask

java.lang.Object
nz.org.riskscape.engine.task.WorkerTask
All Implemented Interfaces:
AutoCloseable
Direct Known Subclasses:
AccumulatorProcessorTask, ChainTask, IndexBuilderTask, IndexEmitterTask, PageSplitterTask, RelationTask, SinkTask

public abstract class WorkerTask extends Object implements AutoCloseable

A WorkerTask is based off a TaskSpec and is what actually carries out the work for the pipeline step(s). When a TaskSpec supports parallel processing, it gets split into multiple WorkerTasks that all can run at the same time. The WorkerTasks get managed by the Scheduler and get assigned to a Worker thread to run.

  • Field Details

    • spec

      protected final TaskSpec spec
    • id

      protected final int id
    • pageWriter

      protected final PageWriter pageWriter
    • pageReader

      protected final PageReader pageReader
    • in

      protected final com.codahale.metrics.Meter in
    • out

      protected final com.codahale.metrics.Meter out
    • runtime

      protected final com.codahale.metrics.Counter runtime
    • runtimeAverage

      protected final com.codahale.metrics.Histogram runtimeAverage
    • processingResult

      protected Object processingResult

      Set this during run to pass the processing result to any dependents. It's good hygiene to set this only once processing is complete, but the Scheduler won't consume it until the task signals it is complete.

  • Constructor Details

    • WorkerTask

      public WorkerTask(TaskSpec spec)
    • WorkerTask

      public WorkerTask()
  • Method Details

    • run

      public abstract ReturnState run()

      Processes the work that the task has to do. The task doesn't necessarily run to completion in one go - it's likely that the task will run out of input or output first, so it'll need to keep coming back and chipping away at the work.

    • runPublic

      public ReturnState runPublic()
    • getPageReader

      public Optional<PageReader> getPageReader()

      Returns an optional helper for reading pages of input Tuples from the ReadPageBuffer (note that not all tasks have an input buffer though).

    • getPageWriter

      public Optional<PageWriter> getPageWriter()

      Returns an optional helper for writing pages of output Tuples to the WritePageBuffer (note that not all tasks have an output buffer though).

    • hasInputPage

      protected boolean hasInputPage()
    • hasOutputPage

      protected boolean hasOutputPage()
    • hasPageInProgress

      public boolean hasPageInProgress()

      Returns true if the task currently has a page of either input or output that it hasn't finished with yet.

    • isInputReady

      public final boolean isInputReady()

      Test for whether this task should run based on the state of the input buffer. Note that because of an awkward case in the scheduling logic, this method will return true if there are no tuples but the input buffer has been marked as complete. This allows the task to run once more and be run to completion. This is something we should probably be handling in the scheduler logic instead of this test.

      Returns:
      true if there are tuples in the input buffer to be read, or an incomplete page, or the input buffer has been marked as complete.
    • isOutputReady

      public final boolean isOutputReady()

      Test for whether this task should run based on the state of the output buffer.

      It will ignore the state of the output buffer if there's no more input - but this check may disappear once we have better test coverage and can confirm it's not relied upon. If it is, it should be in a task specific check.

      Returns:
      true if there is capacity in the output or the input is complete
    • isReadyToRun

      public boolean isReadyToRun()

      Returns true if the task has work it can do. Returns false if the task is blocked waiting on either more input, more output buffers to free up, or it's dependent on other tasks that haven't completed yet.

    • taskComplete

      protected ReturnState taskComplete()
    • getLastStep

      public nz.org.riskscape.engine.pipeline.RealizedStep getLastStep()
    • getFirstStep

      public nz.org.riskscape.engine.pipeline.RealizedStep getFirstStep()
    • getFirstStepRealizedResult

      protected nz.org.riskscape.engine.pipeline.Realized getFirstStepRealizedResult()
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • close

      public void close()

      Override this method to clean up any resources that were allocated/created by this task for use during execution. Will be called from the scheduler once the task has signaled it is complete, but it may also get called if a job that this task was part of has failed.

      Thread safety should be ensured by requiring the scheduler to only call close on a task that is not currently being run.

      Implementations shouldn't need to do any buffer management in this method, it's meant for closing things like TupleIterators or other sorts of resources that follow the Closeable pattern.

      Specified by:
      close in interface AutoCloseable
    • isCreated

      public boolean isCreated()
    • isStarted

      public boolean isStarted()
    • isComplete

      public boolean isComplete()
    • getName

      public final String getName()
      Returns:
      a unique name for this worker task, based on the step(s) it covers, and the type of worker it is.
    • getSpecNameBrief

      public String getSpecNameBrief()
      Returns:
      a unique name for the worker's TaskSpec that is brief and user-friendly. This can be used as a simple way to represent the work that this task is doing to the user.
    • producesResult

      public abstract boolean producesResult()
    • consumeProcessingResult

      public Object consumeProcessingResult()
      Returns:
      a processingResult that has been set, clearing its value in the process (to allow the result to be garbage collected in the fullness of time)
    • markStarted

      public void markStarted()
    • markComplete

      public void markComplete()
    • getSpec

      public TaskSpec getSpec()
    • getContext

      public nz.org.riskscape.engine.pipeline.ExecutionContext getContext()