public class TrafficStreamingMaxLaneFlow
extends java.lang.Object
Concepts: The streaming runner, sliding windows, PubSub topic ingestion, use of the AvroCoder to encode a custom class, and custom Combine transforms.
This pipeline takes as input traffic sensor data from a PubSub topic, and analyzes it using SlidingWindows. For each window, it finds the lane that had the highest flow recorded, for each sensor station. It writes those max values along with auxiliary info to a BigQuery table.
This pipeline expects input from this script, which publishes traffic sensor data to a PubSub topic. After you've started this pipeline, start up the input generation script as per its instructions. The default SlidingWindow parameters assume that you're running this script with the --replay flag, which simulates pauses in the sensor data publication.
To run this example using the Dataflow service, you must provide an input PubSub topic and an output BigQuery table, using the --inputTopic, --dataset, and --table options. Since this is a streaming pipeline that never completes, select the non-blocking pipeline runner by specifying --runner=DataflowPipelineRunner.
When you are done running the example, cancel your pipeline so that you do not continue to be charged for its instances. You can do this by visiting https://console.developers.google.com/project/your-project-name/dataflow/job-id in the Developers Console. You should also terminate the generator script so that you do not use unnecessary PubSub quota.
| Modifier and Type | Class and Description |
|---|---|
static class |
TrafficStreamingMaxLaneFlow.MaxFlow
A custom 'combine function' used with the Combine.perKey transform.
|
| Constructor and Description |
|---|
TrafficStreamingMaxLaneFlow() |
| Modifier and Type | Method and Description |
|---|---|
static void |
main(java.lang.String[] args)
Sets up and starts streaming pipeline.
|