public static class PubsubIO.Read
extends java.lang.Object
PCollection<String> containing the items from
the stream.| Modifier and Type | Class and Description |
|---|---|
static class |
PubsubIO.Read.Bound<T>
A PTransform that reads from a PubSub source and returns
a unbounded PCollection containing the items from the stream.
|
| Constructor and Description |
|---|
PubsubIO.Read() |
| Modifier and Type | Method and Description |
|---|---|
static PubsubIO.Read.Bound<java.lang.String> |
dropLateData(boolean dropLateData)
If true, then late-arriving data from this source will be dropped.
|
static PubsubIO.Read.Bound<java.lang.String> |
idLabel(java.lang.String idLabel)
Creates and returns a PubSubIO.Read PTransform where unique record identifiers are
expected to be provided using the PubSub labeling API.
|
static PubsubIO.Read.Bound<java.lang.String> |
named(java.lang.String name) |
static PubsubIO.Read.Bound<java.lang.String> |
subscription(java.lang.String subscription)
Creates and returns a PubsubIO.Read PTransform for reading from
a specific Pubsub subscription.
|
static PubsubIO.Read.Bound<java.lang.String> |
timestampLabel(java.lang.String timestampLabel)
Creates and returns a PubsubIO.Read PTransform where record timestamps are expected
to be provided using the PubSub labeling API.
|
static PubsubIO.Read.Bound<java.lang.String> |
topic(java.lang.String topic)
Creates and returns a PubsubIO.Read PTransform for reading from
a Pubsub topic with the specified publisher topic.
|
static <T> PubsubIO.Read.Bound<T> |
withCoder(Coder<T> coder)
Creates and returns a PubsubIO.Read PTransform that uses the given
Coder<T> to decode PubSub record into a value of type T. |
public static PubsubIO.Read.Bound<java.lang.String> named(java.lang.String name)
public static PubsubIO.Read.Bound<java.lang.String> topic(java.lang.String topic)
/topics/<project>/<topic>, where <project> is the name of
the publishing project. The <topic> component must comply with
the below requirements.
Dataflow will start reading data published on this topic from the time the pipeline is started. Any data published on the topic before the pipeline is started will not be read by Dataflow.
public static PubsubIO.Read.Bound<java.lang.String> subscription(java.lang.String subscription)
/subscriptions/<project>/<<subscription>,
where <project> is the name of the project the subscription belongs to.
The <subscription> component must comply with the below requirements.
public static PubsubIO.Read.Bound<java.lang.String> timestampLabel(java.lang.String timestampLabel)
<timestampLabel> parameter
specifies the label name. The label value sent to PubsSub is a numerical value representing
the number of milliseconds since the Unix epoch. For example, if using the joda time classes,
org.joda.time.Instant.getMillis() returns the correct value for this label.
If <timestampLabel> is not provided, the system will generate record timestamps
the first time it sees each record. All windowing will be done relative to these timestamps.
By default windows are emitted based on an estimate of when this source is likely
done producing data for a given timestamp (referred to as the Watermark; see
AfterWatermark for more details).
Any late data will be handled by the trigger specified with the windowing strategy -- by
default it will be output immediately.
The {#dropLateData} field allows you to control what to do with late data. This relaxes
the semantics of GroupByKey; see
GroupByKey for additional information on
late data and windowing.
Note that the system can guarantee that no late data will ever be seen when it assigns
timestamps by arrival time (i.e. timestampLabel is not provided).
public static PubsubIO.Read.Bound<java.lang.String> dropLateData(boolean dropLateData)
If late data is not dropped, data for a window can arrive after that window has already
been closed. This relaxes the semantics of GroupByKey; see
GroupByKey
for additional information on late data and windowing.
public static PubsubIO.Read.Bound<java.lang.String> idLabel(java.lang.String idLabel)
<idLabel> parameter
specifies the label name. The label value sent to PubSub can be any string value that
uniquely identifies this record.
If idLabel is not provided, Dataflow cannot guarantee that no duplicate data will be delivered on the PubSub stream. In this case, deduplication of the stream will be stricly best effort.
public static <T> PubsubIO.Read.Bound<T> withCoder(Coder<T> coder)
Coder<T> to decode PubSub record into a value of type T.
By default, uses StringUtf8Coder, which just
returns the text lines as Java strings.
T - the type of the decoded elements, and the elements
of the resulting PCollection.