public class TfIdf
extends java.lang.Object
Concepts: joining data; side inputs
To execute this pipeline locally, specify general pipeline configuration:
--project=
| Modifier and Type | Class and Description |
|---|---|
| static class | TfIdf.ComputeTfIdf: A transform containing a basic TF-IDF pipeline. |
| static class | TfIdf.ReadDocuments: Reads the documents at the provided URIs and returns all lines from the documents, each tagged with the document it came from. |
| static class | TfIdf.WriteTfIdf: A PTransform to write, in CSV format, a mapping from term and URI to score. |
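
To make the computation behind TfIdf.ComputeTfIdf concrete, here is a minimal in-memory sketch in plain Java. It does not use the Dataflow SDK and is not the transform's actual implementation. The class name TfIdfSketch and the method compute are hypothetical, and the scoring formula (term frequency times the natural log of total documents over documents containing the term) is one common TF-IDF definition assumed for illustration. The document-frequency map and the total document count play the role that joins and side inputs would play in a distributed pipeline.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of the TfIdf example class.
final class TfIdfSketch {

    /** Maps each document id to a map from term to TF-IDF score. */
    static Map<String, Map<String, Double>> compute(Map<String, List<String>> docs) {
        // Document frequency: in how many documents does each term appear?
        Map<String, Integer> docFreq = new HashMap<>();
        for (List<String> words : docs.values()) {
            Set<String> distinct = new HashSet<>(words);
            for (String term : distinct) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }

        int totalDocs = docs.size();
        Map<String, Map<String, Double>> result = new HashMap<>();
        for (Map.Entry<String, List<String>> doc : docs.entrySet()) {
            List<String> words = doc.getValue();

            // Raw count of each term within this document.
            Map<String, Integer> counts = new HashMap<>();
            for (String term : words) {
                counts.merge(term, 1, Integer::sum);
            }

            Map<String, Double> scores = new HashMap<>();
            for (Map.Entry<String, Integer> c : counts.entrySet()) {
                // Assumed formula: tf = count / words in document,
                // idf = ln(total documents / documents containing the term).
                double tf = c.getValue() / (double) words.size();
                double idf = Math.log(totalDocs / (double) docFreq.get(c.getKey()));
                scores.put(c.getKey(), tf * idf);
            }
            result.put(doc.getKey(), scores);
        }
        return result;
    }
}
```

Under this assumed formula, a term that appears in every document scores zero, because its inverse document frequency is ln(1).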
| Constructor and Description |
|---|
| TfIdf() |
| Modifier and Type | Method and Description |
|---|---|
| static java.util.Set<java.net.URI> | listInputDocuments(com.google.cloud.dataflow.examples.TfIdf.Options options): Lists documents contained beneath the options.input prefix/directory. |
| static void | main(java.lang.String[] args) |
public static java.util.Set<java.net.URI> listInputDocuments(com.google.cloud.dataflow.examples.TfIdf.Options options)
throws java.net.URISyntaxException,
java.io.IOException
Lists documents contained beneath the options.input prefix/directory.
Throws: java.net.URISyntaxException, java.io.IOException

public static void main(java.lang.String[] args)
throws java.lang.Exception
Throws: java.lang.Exception
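
As an illustration of what listInputDocuments returns, the following sketch lists the regular files directly beneath a local directory and returns them as a set of URIs. It is an assumption-laden simplification: it covers only the local file system, does not recurse, and does not read any TfIdf.Options value. The class name ListDocumentsSketch and the method listLocalDocuments are hypothetical and only mirror the documented return type.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

// Hypothetical illustration only; not the implementation of listInputDocuments.
final class ListDocumentsSketch {

    /** Returns the URIs of the regular files directly beneath dir (non-recursive). */
    static Set<URI> listLocalDocuments(Path dir) throws IOException {
        // TreeSet gives a deterministic ordering; URI implements Comparable.
        Set<URI> uris = new TreeSet<>();
        // Files.list returns a lazily populated stream that must be closed.
        try (Stream<Path> entries = Files.list(dir)) {
            entries.filter(Files::isRegularFile)
                   .forEach(p -> uris.add(p.toUri()));
        }
        return uris;
    }
}
```

Each returned URI could then identify one input document, in the way ReadDocuments tags every line with the document it came from.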