- All Implemented Interfaces:
- com.google.cloud.dataflow.sdk.transforms.display.HasDisplayData, Serializable
- Enclosing class:
- TfIdf
public static class TfIdf.ComputeTfIdf
extends PTransform<PCollection<KV<URI,String>>,PCollection<KV<String,KV<URI,Double>>>>
A transform containing a basic TF-IDF pipeline. The input consists of KV objects
where the key is the document's URI and the value is a piece
of the document's content. The output is a mapping from each term to
its score for each document URI.
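
To make the input and output shapes concrete, here is a minimal plain-Java sketch that computes TF-IDF scores in memory. It is not this transform's implementation and does not use the Dataflow SDK. The class name TfIdfSketch, the method computeTfIdf, the tokenization rule, and the scoring formula (term frequency multiplied by the natural log of total documents over documents containing the term) are all assumptions chosen for illustration.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the SDK.
final class TfIdfSketch {

    // Input: document URI -> pieces of that document's content.
    // Output: term -> (document URI -> score), mirroring KV<String, KV<URI, Double>>.
    static Map<String, Map<URI, Double>> computeTfIdf(Map<URI, List<String>> docs) {
        Map<URI, Map<String, Integer>> termCounts = new HashMap<>();
        Map<URI, Integer> wordTotals = new HashMap<>();
        Map<String, Integer> docsContaining = new HashMap<>();

        for (Map.Entry<URI, List<String>> doc : docs.entrySet()) {
            Map<String, Integer> counts = new HashMap<>();
            int total = 0;
            for (String piece : doc.getValue()) {
                // Assumed tokenization: lower-case, split on anything that is not a letter or digit.
                for (String word : piece.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
                    if (word.isEmpty()) {
                        continue;
                    }
                    counts.merge(word, 1, Integer::sum);
                    total++;
                }
            }
            termCounts.put(doc.getKey(), counts);
            wordTotals.put(doc.getKey(), total);
            // Each document counts once per distinct term it contains.
            for (String word : counts.keySet()) {
                docsContaining.merge(word, 1, Integer::sum);
            }
        }

        int totalDocs = docs.size();
        Map<String, Map<URI, Double>> result = new TreeMap<>();
        for (Map.Entry<URI, Map<String, Integer>> doc : termCounts.entrySet()) {
            URI uri = doc.getKey();
            for (Map.Entry<String, Integer> term : doc.getValue().entrySet()) {
                double tf = term.getValue() / (double) wordTotals.get(uri);
                double idf = Math.log((double) totalDocs / docsContaining.get(term.getKey()));
                result.computeIfAbsent(term.getKey(), k -> new HashMap<>()).put(uri, tf * idf);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<URI, List<String>> docs = new HashMap<>();
        docs.put(URI.create("file:///a.txt"), Arrays.asList("the cat", "sat"));
        docs.put(URI.create("file:///b.txt"), Arrays.asList("the dog"));
        computeTfIdf(docs).forEach((term, scores) -> System.out.println(term + " -> " + scores));
    }
}
```

Under the assumed formula, a term that appears in every document (here "the") scores zero, because the log of total documents over containing documents is log(1). The actual transform may tokenize and weight terms differently.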
- See Also:
- Serialized Form