@Experimental(value=SOURCE_SINK) public class DatastoreV1 extends Object
DatastoreV1 provides an API to Read, Write and Delete PCollections
of Google Cloud Datastore version v1
Entity objects.
This API currently requires an authentication workaround. To use DatastoreV1, users
must use the gcloud command line tool to get credentials for Cloud Datastore:
$ gcloud auth login
To read a PCollection from a Cloud Datastore query, use read()
with DatastoreV1.Read.withProjectId(java.lang.String) and DatastoreV1.Read.withQuery(com.google.datastore.v1.Query) to
specify the project to query and the query to read from. You can optionally provide a namespace
to query within using DatastoreV1.Read.withNamespace(java.lang.String). You can also optionally specify
how many splits you want for the query using DatastoreV1.Read.withNumQuerySplits(int).
For example:
// Read a query from Datastore
PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
Query query = ...;
String projectId = "...";
Pipeline p = Pipeline.create(options);
PCollection<Entity> entities = p.apply(
DatastoreIO.v1().read()
.withProjectId(projectId)
.withQuery(query));
Note: Normally, a Cloud Dataflow job will read from Cloud Datastore in parallel across
many workers. However, when the Query is configured with a limit using
Query.Builder.setLimit(Int32Value), then
all returned results will be read by a single Dataflow worker in order to ensure correct data.
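The effect of a limit on parallelism can be pictured with a small stand-alone model. This is only a sketch of the rule stated in the note, not the connector's code: `SplitPlanner` and `plannedSplits` are hypothetical names, and the model assumes that a requested split count is used as-is when no limit is set.

```java
import java.util.OptionalInt;

// Hypothetical stand-alone model; not part of DatastoreV1.
final class SplitPlanner {
    private SplitPlanner() {}

    /**
     * Returns how many sub-queries this model would read in parallel.
     * A query with a limit is never split, so one worker sees every result.
     * Assumption: without a limit, the requested split count is used unchanged.
     */
    static int plannedSplits(OptionalInt queryLimit, int requestedSplits) {
        if (requestedSplits < 1) {
            throw new IllegalArgumentException("requestedSplits must be >= 1");
        }
        return queryLimit.isPresent() ? 1 : requestedSplits;
    }
}
```

In this model, `plannedSplits(OptionalInt.of(100), 8)` yields a single split, while the same call with `OptionalInt.empty()` keeps all eight.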
To write a PCollection to Cloud Datastore, use write(),
specifying the Cloud Datastore project to write to:
PCollection<Entity> entities = ...;
entities.apply(DatastoreIO.v1().write().withProjectId(projectId));
p.run();
To delete a PCollection of Entities from Cloud Datastore, use
deleteEntity(), specifying the Cloud Datastore project to delete from:
PCollection<Entity> entities = ...;
entities.apply(DatastoreIO.v1().deleteEntity().withProjectId(projectId));
p.run();
To delete entities associated with a PCollection of Keys from Cloud
Datastore, use deleteKey(), specifying the Cloud Datastore project to delete from:
PCollection<Key> keys = ...;
keys.apply(DatastoreIO.v1().deleteKey().withProjectId(projectId));
p.run();
Entities in the PCollection to be written or deleted must have complete
Keys. A complete Key specifies the name or id of the
Entity, whereas an incomplete Key does not. A namespace other than
the project's default may be used by specifying it in the Entity Keys:
Key.Builder keyBuilder = DatastoreHelper.makeKey(...);
keyBuilder.getPartitionIdBuilder().setNamespace(namespace);
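The completeness rule can be checked on a simplified stand-in for a key. This is a sketch, not the real `com.google.datastore.v1.Key`: `SimpleKey` and its fields are hypothetical, and it assumes a key counts as complete when it carries either a numeric id or a string name, with `0` and the empty string standing for "not set".

```java
import java.util.Objects;

// Hypothetical, simplified model of a Datastore key; for illustration only.
final class SimpleKey {
    final String namespace; // "" stands for the default namespace in this model
    final String kind;
    final long id;          // 0 means no numeric id has been assigned
    final String name;      // "" means no string name has been assigned

    SimpleKey(String namespace, String kind, long id, String name) {
        this.namespace = Objects.requireNonNull(namespace);
        this.kind = Objects.requireNonNull(kind);
        this.id = id;
        this.name = Objects.requireNonNull(name);
    }

    /** A key is complete once it carries an id or a name. */
    boolean isComplete() {
        return id != 0 || !name.isEmpty();
    }

    /** True when the key does not name a namespace of its own. */
    boolean usesDefaultNamespace() {
        return namespace.isEmpty();
    }
}
```

A pipeline could apply a check of this shape before writing, so that incomplete keys are rejected early rather than at commit time.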
Entities will be committed as upsert (update or insert) or delete mutations. Please
read Entities, Properties,
and Keys for more information about Entity keys.
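Upsert and delete semantics can be illustrated with an in-memory map. This is a sketch under stated assumptions, not how Cloud Datastore is implemented: `InMemoryStore` is a hypothetical class, and keys and values are plain strings rather than Entity objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a datastore; for illustration only.
final class InMemoryStore {
    private final Map<String, String> rows = new HashMap<>();

    /** Upsert: insert when the key is new, overwrite when it already exists. */
    void upsert(String key, String value) {
        rows.put(key, value);
    }

    /** Delete: remove the row if present; a missing key is left as a no-op. */
    void delete(String key) {
        rows.remove(key);
    }

    /** Returns the stored value, or null when the key is absent. */
    String get(String key) {
        return rows.get(key);
    }

    int size() {
        return rows.size();
    }
}
```

In this model, replaying the same upsert or the same delete leaves the store unchanged, which is convenient if a write is retried.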
Permission requirements depend on the PipelineRunner that is used to execute the
Dataflow job. Please refer to the documentation of the corresponding PipelineRunner for
more details.
Please see Cloud Datastore Sign Up for security and permission related information specific to Cloud Datastore.
| Modifier and Type | Class and Description |
|---|---|
| static class | DatastoreV1.DeleteEntity: A PTransform that deletes Entities from Cloud Datastore. |
| static class | DatastoreV1.DeleteKey |
| static class | DatastoreV1.Read: A PTransform that reads the result rows of a Cloud Datastore query as Entity objects. |
| static class | DatastoreV1.Write: A PTransform that writes Entity objects to Cloud Datastore. |


| Modifier and Type | Method and Description |
|---|---|
| DatastoreV1.DeleteEntity | deleteEntity(): Returns an empty DatastoreV1.DeleteEntity builder. |
| DatastoreV1.DeleteKey | deleteKey(): Returns an empty DatastoreV1.DeleteKey builder. |
| DatastoreV1.Read | read(): Returns an empty DatastoreV1.Read builder. |
| DatastoreV1.Write | write(): Returns an empty DatastoreV1.Write builder. |

public DatastoreV1.Read read()
Returns an empty DatastoreV1.Read builder. Configure the source projectId,
query, and optionally namespace and numQuerySplits using
DatastoreV1.Read.withProjectId(java.lang.String), DatastoreV1.Read.withQuery(com.google.datastore.v1.Query),
DatastoreV1.Read.withNamespace(java.lang.String), and DatastoreV1.Read.withNumQuerySplits(int).
public DatastoreV1.Write write()
Returns an empty DatastoreV1.Write builder. Configure the destination
projectId using DatastoreV1.Write.withProjectId(java.lang.String).
public DatastoreV1.DeleteEntity deleteEntity()
Returns an empty DatastoreV1.DeleteEntity builder. Configure the destination
projectId using DatastoreV1.DeleteEntity.withProjectId(java.lang.String).
public DatastoreV1.DeleteKey deleteKey()
Returns an empty DatastoreV1.DeleteKey builder. Configure the destination
projectId using DatastoreV1.DeleteKey.withProjectId(java.lang.String).