DirectPipelineRunner and
DataflowPipelineRunner.See: Description
| Interface | Description |
|---|---|
| DataflowPipelineTranslator.TransformTranslator<TransformT extends PTransform> |
A
DataflowPipelineTranslator.TransformTranslator knows how to translate
a particular subclass of PTransform for the
Cloud Dataflow service. |
| DataflowPipelineTranslator.TranslationContext |
The interface provided to registered callbacks for interacting
with the
DataflowPipelineRunner, including reading and writing the
values of PCollections and side inputs (PCollectionViews). |
| DirectPipelineRunner.EvaluationContext |
The interface provided to registered callbacks for interacting
with the
DirectPipelineRunner, including reading and writing the
values of PCollections and PCollectionViews. |
| DirectPipelineRunner.EvaluationResults |
The interface provided to registered callbacks for interacting
with the
DirectPipelineRunner, including reading and writing the
values of PCollections and PCollectionViews. |
| DirectPipelineRunner.TransformEvaluator<TransformT extends PTransform> |
An evaluator of a PTransform.
|
| PipelineRunnerRegistrar |
PipelineRunner creators have the ability to automatically have their
PipelineRunner registered with this SDK by creating a ServiceLoader entry
and a concrete implementation of this interface. |
| Class | Description |
|---|---|
| AggregatorPipelineExtractor |
An
AggregatorPipelineExtractor retrieves Aggregators at each
ParDo and returns a Map of Aggregator to the
PTransforms in which it is present. |
| AggregatorValues<T> |
A collection of values associated with an
Aggregator. |
| BlockingDataflowPipelineRunner |
A
PipelineRunner that's like DataflowPipelineRunner
but that waits for the launched job to finish. |
| DataflowPipeline | |
| DataflowPipelineJob |
A DataflowPipelineJob represents a job submitted to Dataflow using
DataflowPipelineRunner. |
| DataflowPipelineRegistrar | |
| DataflowPipelineRegistrar.Options |
Register the
DataflowPipelineOptions and BlockingDataflowPipelineOptions. |
| DataflowPipelineRegistrar.Runner |
Register the
DataflowPipelineRunner and BlockingDataflowPipelineRunner. |
| DataflowPipelineRunner |
A
PipelineRunner that executes the operations in the
pipeline by first translating them to the Dataflow representation
using the DataflowPipelineTranslator and then submitting
them to a Dataflow service for execution. |
| DataflowPipelineRunner.StreamingPubsubIOWrite<T> |
Specialized implementation for
PubsubIO.Write for the Dataflow runner in streaming
mode. |
| DataflowPipelineRunnerHooks |
An instance of this class can be passed to the
DataflowPipelineRunner to add user defined hooks to be
invoked at various times during pipeline execution. |
| DataflowPipelineTranslator |
DataflowPipelineTranslator knows how to translate Pipeline objects
into Cloud Dataflow Service API Jobs. |
| DataflowPipelineTranslator.JobSpecification |
The result of a job translation.
|
| DirectPipeline |
A
DirectPipeline is a Pipeline that returns
DirectPipelineRunner.EvaluationResults when it is
Pipeline.run(). |
| DirectPipelineRegistrar | |
| DirectPipelineRegistrar.Options |
Register the
DirectPipelineRunner. |
| DirectPipelineRegistrar.Runner |
Register the
DirectPipelineOptions. |
| DirectPipelineRunner |
Executes the operations in the pipeline directly, in this process, without
any optimization.
|
| DirectPipelineRunner.TestCombineDoFn<K,InputT,AccumT,OutputT> |
The implementation may split the
Combine.KeyedCombineFn into ADD, MERGE and EXTRACT phases (
see com.google.cloud.dataflow.sdk.runners.worker.CombineValuesFn). |
| DirectPipelineRunner.ValueWithMetadata<V> |
An immutable (value, timestamp) pair, along with other metadata necessary
for the implementation of
DirectPipelineRunner. |
| PipelineRunner<ResultT extends PipelineResult> |
A
PipelineRunner can execute, translate, or otherwise process a
Pipeline. |
| RecordingPipelineVisitor |
Provides a simple
Pipeline.PipelineVisitor
that records the transformation tree. |
| TransformHierarchy |
Captures information about a collection of transformations and their
associated
PValues. |
| TransformTreeNode |
Provides internal tracking of transform relationships with helper methods
for initialization and ordered visitation.
|
| Exception | Description |
|---|---|
| AggregatorRetrievalException |
Signals that an exception has occurred while retrieving
Aggregators. |
| DataflowJobAlreadyExistsException |
An exception that is thrown if the unique job name constraint of the Dataflow
service is broken because an existing job with the same job name is currently active.
|
| DataflowJobAlreadyUpdatedException |
An exception that is thrown if the existing job has already been updated within the Dataflow
service and is no longer able to be updated.
|
| DataflowJobCancelledException |
Signals that a job run by a
BlockingDataflowPipelineRunner was updated during execution. |
| DataflowJobException |
A
RuntimeException that contains information about a DataflowPipelineJob. |
| DataflowJobExecutionException |
Signals that a job run by a
BlockingDataflowPipelineRunner fails during execution, and
provides access to the failed job. |
| DataflowJobUpdatedException |
Signals that a job run by a
BlockingDataflowPipelineRunner was updated during execution. |
| DataflowServiceException |
Signals there was an error retrieving information about a job from the Cloud Dataflow Service.
|
DirectPipelineRunner and
DataflowPipelineRunner.
DirectPipelineRunner executes a Pipeline
locally, without contacting the Dataflow service.
DataflowPipelineRunner submits a
Pipeline to the Dataflow service, which executes it on Dataflow-managed Compute Engine
instances. DataflowPipelineRunner returns
as soon as the Pipeline has been submitted. Use
BlockingDataflowPipelineRunner to have execution
updates printed to the console.
The runner is specified as part PipelineOptions.