Package com.cognite.client
Class EntityMatching
java.lang.Object
com.cognite.client.EntityMatching
This class represents the Cognite entity matching api endpoint.
It provides methods for interacting with the entity matching services.
-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method - Description
protected Request addAuthInfo(Request request) - Adds the required authentication information into the request object.
protected Aggregate aggregate(ResourceType resourceType, Request requestParameters) - Performs an item aggregation request to Cognite Data Fusion.
buildPartitionsList(int noPartitions) - Builds an array of partition specifications for parallel retrieval from the Cognite api.
create(Collection<Request> requests) - Train a model that predicts matches between entities (for example, time series names to asset names).
deDuplicate(Collection<Item> itemList) - De-duplicates a collection of Item.
delete(List<Item> entityMatchingModels) - Deletes a set of entity matching models.
abstract CogniteClient getClient()
protected Iterator<CompletableFuture<ResponseItems<String>>> getListResponseIterator(ResourceType resourceType, Request requestParameters)
protected boolean itemsHaveId(Collection<Item> items) - Returns true if all items contain either an externalId or id.
listJson(ResourceType resourceType, Request requestParameters, String... partitions) - Returns the results from a list / filter api endpoint.
listJson(ResourceType resourceType, Request requestParameters, String partitionKey, String... partitions) - Returns the results from a list / filter api endpoint.
mapItemToId(Collection<Item> items) - Maps all items to their externalId (primary) or id (secondary).
static EntityMatching of(CogniteClient client) - Construct a new EntityMatching object using the provided configuration.
parseItems(List<String> input) - Parses a list of item objects in json representation to typed objects.
protected String parseName(String json) - Returns the name attribute value from a json input.
protected String parseString(String itemJson, String fieldName) - Try parsing the specified Json path as a String.
predict(long modelId, List<com.google.protobuf.Struct> sources, Collection<com.google.protobuf.Struct> targets) - Matches a set of source entities with a set of targets via a given matching model.
predict(long modelId, List<com.google.protobuf.Struct> sources, Collection<com.google.protobuf.Struct> targets, int numMatches) - Matches a set of source entities with a set of targets via a given matching model.
predict(long modelId, List<com.google.protobuf.Struct> sources, Collection<com.google.protobuf.Struct> targets, int numMatches, double scoreThreshold) - Matches a set of source entities with a set of targets via a given matching model.
predict(String modelExternalId, List<com.google.protobuf.Struct> sources, Collection<com.google.protobuf.Struct> targets) - Matches a set of source entities with a set of targets via a given matching model.
predict(String modelExternalId, List<com.google.protobuf.Struct> sources, Collection<com.google.protobuf.Struct> targets, int numMatches) - Matches a set of source entities with a set of targets via a given matching model.
predict(String modelExternalId, List<com.google.protobuf.Struct> sources, Collection<com.google.protobuf.Struct> targets, int numMatches, double scoreThreshold) - Matches a set of source entities with a set of targets via a given matching model.
predict(Collection<Request> requests) - Matches a set of source entities with a set of targets via a given matching model.
retrieveJson(ResourceType resourceType, Collection<Item> items) - Retrieve items by id.
retrieveJson(ResourceType resourceType, Collection<Item> items, Map<String, Object> parameters) - Retrieve items by id.
toRequestItems(Collection<Item> itemList) - Converts a list of Item to a request object structure (that can later be parsed to Json).
-
Field Details
-
LOG
protected static final org.slf4j.Logger LOG
-
-
Constructor Details
-
EntityMatching
public EntityMatching()
-
-
Method Details
-
of
Construct a new EntityMatching object using the provided configuration. This method is intended for internal use. SDK clients should always use CogniteClient as the entry point to this class.
Parameters:
client - The CogniteClient to use for configuration settings.
Returns:
The entity matching api object.
-
predict
public List<EntityMatchResult> predict(String modelExternalId, List<com.google.protobuf.Struct> sources, Collection<com.google.protobuf.Struct> targets) throws Exception
Matches a set of source entities with a set of targets via a given matching model. If either sources or targets are empty lists, the entity matcher will use the sources/targets from the model training. The default number of matches is 1 and the default score threshold used for matching is 0.
API Reference - Predict matches
Example:
    String modelExternalId = // modelExternalId;
    List<Struct> sources = // sources;
    List<Struct> targets = // targets;
    List<EntityMatchResult> result = client.contextualization()
            .entityMatching()
            .predict(modelExternalId, sources, targets);
Parameters:
modelExternalId - The external id of the matching model to use.
sources - A list of entities to match from. If the list is empty, the model training sources will be used.
targets - A list of entities to match to. If the list is empty, the model training targets will be used.
Returns:
The entity matching results.
Throws:
Exception
-
predict
public List<EntityMatchResult> predict(String modelExternalId, List<com.google.protobuf.Struct> sources, Collection<com.google.protobuf.Struct> targets, int numMatches) throws Exception
Matches a set of source entities with a set of targets via a given matching model. If either sources or targets are empty lists, the entity matcher will use the sources/targets from the model training. The default score threshold used for matching is 0.
API Reference - Predict matches
Example:
    String modelExternalId = // modelExternalId;
    List<Struct> sources = // sources;
    List<Struct> targets = // targets;
    List<EntityMatchResult> result = client.contextualization()
            .entityMatching()
            .predict(modelExternalId, sources, targets, 1);
Parameters:
modelExternalId - The external id of the matching model to use.
sources - A list of entities to match from. If the list is empty, the model training sources will be used.
targets - A list of entities to match to. If the list is empty, the model training targets will be used.
numMatches - The maximum number of match candidates per source.
Returns:
The entity matching results.
Throws:
Exception
-
predict
public List<EntityMatchResult> predict(String modelExternalId, List<com.google.protobuf.Struct> sources, Collection<com.google.protobuf.Struct> targets, int numMatches, double scoreThreshold) throws Exception
Matches a set of source entities with a set of targets via a given matching model. If either sources or targets are empty lists, the entity matcher will use the sources/targets from the model training.
API Reference - Predict matches
Example:
    String modelExternalId = // modelExternalId;
    List<Struct> sources = // sources;
    List<Struct> targets = // targets;
    List<EntityMatchResult> result = client.contextualization()
            .entityMatching()
            .predict(modelExternalId, sources, targets, 1, 0d);
Parameters:
modelExternalId - The external id of the matching model to use.
sources - A list of entities to match from. If the list is empty, the model training sources will be used.
targets - A list of entities to match to. If the list is empty, the model training targets will be used.
numMatches - The maximum number of match candidates per source.
scoreThreshold - The minimum score required for a match candidate.
Returns:
The entity matching results.
Throws:
Exception
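The combined effect of numMatches and scoreThreshold can be pictured as a per-source filter over scored candidates. The sketch below is plain Java and does not call the SDK. The MatchFilterSketch class, the Candidate record and the selectMatches helper are hypothetical names, and it assumes that candidates scoring below the threshold are dropped and the rest are capped at numMatches, best first. The service itself may differ in details such as tie handling.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class MatchFilterSketch {
    /** Hypothetical stand-in for one scored match candidate of a single source. */
    record Candidate(String target, double score) {}

    /** Keeps candidates with score >= scoreThreshold, best first, at most numMatches of them. */
    static List<Candidate> selectMatches(List<Candidate> candidates, int numMatches, double scoreThreshold) {
        return candidates.stream()
                .filter(c -> c.score() >= scoreThreshold)
                .sorted(Comparator.comparingDouble(Candidate::score).reversed())
                .limit(numMatches)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Candidate> candidates = List.of(
                new Candidate("asset-A", 0.91),
                new Candidate("asset-B", 0.42),
                new Candidate("asset-C", 0.77));
        // With numMatches = 2 and scoreThreshold = 0.5, asset-B is filtered out.
        System.out.println(selectMatches(candidates, 2, 0.5));
    }
}
```

With the defaults described above (numMatches 1, scoreThreshold 0), this model would keep only the single best candidate per source.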
-
predict
public List<EntityMatchResult> predict(long modelId, List<com.google.protobuf.Struct> sources, Collection<com.google.protobuf.Struct> targets) throws Exception
Matches a set of source entities with a set of targets via a given matching model. If either sources or targets are empty lists, the entity matcher will use the sources/targets from the model training. The default number of matches is 1 and the default score threshold used for matching is 0.
API Reference - Predict matches
Example:
    Long modelId = // modelId;
    List<Struct> sources = // sources;
    List<Struct> targets = // targets;
    List<EntityMatchResult> result = client.contextualization()
            .entityMatching()
            .predict(modelId, sources, targets);
Parameters:
modelId - The internal id of the matching model to use.
sources - A list of entities to match from. If the list is empty, the model training sources will be used.
targets - A list of entities to match to. If the list is empty, the model training targets will be used.
Returns:
The entity matching results.
Throws:
Exception
-
predict
public List<EntityMatchResult> predict(long modelId, List<com.google.protobuf.Struct> sources, Collection<com.google.protobuf.Struct> targets, int numMatches) throws Exception
Matches a set of source entities with a set of targets via a given matching model. If either sources or targets are empty lists, the entity matcher will use the sources/targets from the model training. The default score threshold used for matching is 0.
API Reference - Predict matches
Example:
    Long modelId = // modelId;
    List<Struct> sources = // sources;
    List<Struct> targets = // targets;
    List<EntityMatchResult> result = client.contextualization()
            .entityMatching()
            .predict(modelId, sources, targets, 1);
Parameters:
modelId - The internal id of the matching model to use.
sources - A list of entities to match from. If the list is empty, the model training sources will be used.
targets - A list of entities to match to. If the list is empty, the model training targets will be used.
numMatches - The maximum number of match candidates per source.
Returns:
The entity matching results.
Throws:
Exception
-
predict
public List<EntityMatchResult> predict(long modelId, List<com.google.protobuf.Struct> sources, Collection<com.google.protobuf.Struct> targets, int numMatches, double scoreThreshold) throws Exception
Matches a set of source entities with a set of targets via a given matching model. If either sources or targets are empty lists, the entity matcher will use the sources/targets from the model training.
API Reference - Predict matches
Example:
    Long modelId = // modelId;
    List<Struct> sources = // sources;
    List<Struct> targets = // targets;
    List<EntityMatchResult> result = client.contextualization()
            .entityMatching()
            .predict(modelId, sources, targets, 1, 0d);
Parameters:
modelId - The internal id of the matching model to use.
sources - A list of entities to match from. If the list is empty, the model training sources will be used.
targets - A list of entities to match to. If the list is empty, the model training targets will be used.
numMatches - The maximum number of match candidates per source.
scoreThreshold - The minimum score required for a match candidate.
Returns:
The entity matching results.
Throws:
Exception
-
predict
Matches a set of source entities with a set of targets via a given matching model. If either sources or targets are empty lists, the entity matcher will use the sources/targets from the model training.
API Reference - Predict matches
Example:
    List<Struct> sourceBatch = // List of Struct
    List<Request> requestBatches = new ArrayList<>();
    requestBatches.add(Request.create().withRootParameter("sources", sourceBatch));
    List<EntityMatchResult> result = client.contextualization()
            .entityMatching()
            .predict(requestBatches);
Parameters:
requests - Input parameters for the predict jobs.
Returns:
The entity match results.
Throws:
Exception
-
create
Train a model that predicts matches between entities (for example, time series names to asset names). This is also known as fuzzy joining. If there are no trueMatches (labeled data), a static (unsupervised) model is trained. Otherwise a machine learned (supervised) model is trained.
API Reference - Create entity matcher model
Example:
    List<Struct> sources = // sources;
    List<Struct> targets = // targets;
    String[] modelTypes = {"simple", "insensitive", "bigram", "frequencyweightedbigram", "bigramextratokenizers", "bigramcombo"};
    Request entityMatchFitRequest = Request.create()
            .withRootParameter("sources", sources)
            .withRootParameter("targets", targets)
            .withRootParameter("matchFields", Map.of("source", "name", "target", "externalId"))
            .withRootParameter("featureType", modelTypes[1]);
    List<EntityMatchModel> models = client.contextualization().entityMatching()
            .create(List.of(entityMatchFitRequest));
Parameters:
requests - Input parameters for the create model job(s).
Returns:
The created entity match models.
Throws:
Exception
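To give a feel for what fuzzy joining on character bigrams means, the following sketch scores name pairs by the overlap of their lowercase bigram sets (Dice coefficient) and picks the best target for a source. It is an illustration only: the BigramSketch class and its methods are hypothetical, and this is not claimed to be the algorithm behind the "bigram" or "insensitive" feature types in Cognite Data Fusion.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class BigramSketch {
    /** Splits a string into its set of lowercase character bigrams. */
    static Set<String> bigrams(String s) {
        String t = s.toLowerCase();
        Set<String> result = new HashSet<>();
        for (int i = 0; i + 1 < t.length(); i++) {
            result.add(t.substring(i, i + 2));
        }
        return result;
    }

    /** Dice coefficient over bigram sets: 2 * |intersection| / (|A| + |B|), in [0, 1]. */
    static double similarity(String a, String b) {
        Set<String> x = bigrams(a);
        Set<String> y = bigrams(b);
        if (x.isEmpty() && y.isEmpty()) {
            // Strings shorter than two characters have no bigrams; fall back to equality.
            return a.equalsIgnoreCase(b) ? 1.0 : 0.0;
        }
        Set<String> common = new HashSet<>(x);
        common.retainAll(y);
        return 2.0 * common.size() / (x.size() + y.size());
    }

    /** Returns the target with the highest similarity to the source, or null if targets is empty. */
    static String bestMatch(String source, List<String> targets) {
        String best = null;
        double bestScore = -1.0;
        for (String target : targets) {
            double score = similarity(source, target);
            if (score > bestScore) {
                bestScore = score;
                best = target;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // A time series name matched against two candidate asset names.
        System.out.println(bestMatch("pump_01", List.of("valve_77", "PUMP-01")));
    }
}
```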
-
delete
Deletes a set of entity matching models. The models to delete are identified via their externalId / id by submitting a list of Item.
API Reference - Delete entity matcher model
Example:
    List<Item> entityMatchingModels = List.of(Item.newBuilder().setExternalId("1").build());
    List<Item> deleteItemsResults = client.contextualization().entityMatching()
            .delete(entityMatchingModels);
Parameters:
entityMatchingModels - A list of Item representing the entity matching models (externalId / id) to be deleted.
Returns:
The deleted models, as a list of Item.
Throws:
Exception
See Also:
CogniteClient
CogniteClient.contextualization()
Contextualization.entityMatching()
ApiBase.DeleteItems.deleteItems(List)
-
getClient
-
buildPartitionsList
Builds an array of partition specifications for parallel retrieval from the Cognite api. This specification is used as a parameter together with the filter / list endpoints. The number of partitions indicates the number of parallel read streams. Employ one partition specification per read stream.
Example:
    List<String> partitions = buildPartitionsList(getClient().getClientConfig().getNoListPartitions());
Parameters:
noPartitions - The total number of partitions.
Returns:
A List of partition specifications.
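In the Cognite api, a partition specification is written as a string of the form "m/n", meaning partition m of n. The sketch below assumes that format and shows what a list of specifications for a given number of read streams could look like. The PartitionSketch class and its buildPartitions method are hypothetical and only mirror the description above. They are not the SDK implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class PartitionSketch {
    /** Builds the specifications "1/n", "2/n", ..., "n/n" for n = noPartitions. */
    static List<String> buildPartitions(int noPartitions) {
        if (noPartitions < 1) {
            throw new IllegalArgumentException("noPartitions must be at least 1");
        }
        List<String> partitions = new ArrayList<>(noPartitions);
        for (int i = 1; i <= noPartitions; i++) {
            partitions.add(i + "/" + noPartitions);
        }
        return partitions;
    }

    public static void main(String[] args) {
        // One specification per parallel read stream.
        System.out.println(buildPartitions(3)); // prints [1/3, 2/3, 3/3]
    }
}
```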
-
listJson
protected Iterator<List<String>> listJson(ResourceType resourceType, Request requestParameters, String... partitions) throws Exception
Returns the results from a list / filter api endpoint. For example, the filter assets endpoint. The results are paged through / iterated over via an Iterator. The entire result set is not buffered in memory, but streamed in "pages" from the Cognite api. If you need to buffer the entire result set, you have to stream these results into your own data structure. This method supports parallel retrieval via a set of partition specifications. The specified partitions will be collected and merged together before being returned via the Iterator.
Example:
    Iterator<List<String>> result = listJson(resourceType, requestParameters, partitions);
Parameters:
resourceType - The resource type to query / filter / list. Ex. event, asset, time series.
requestParameters - The query / filter specification. Follows the Cognite api request parameters.
partitions - An optional set of partitions to read via.
Returns:
An Iterator over the result set.
Throws:
Exception
-
listJson
protected Iterator<List<String>> listJson(ResourceType resourceType, Request requestParameters, String partitionKey, String... partitions) throws Exception
Returns the results from a list / filter api endpoint. For example, the filter assets endpoint. The results are paged through / iterated over via an Iterator. The entire result set is not buffered in memory, but streamed in "pages" from the Cognite api. If you need to buffer the entire result set, you have to stream these results into your own data structure. This method supports parallel retrieval via a set of partition specifications. The specified partitions will be collected and merged together before being returned via the Iterator.
Example:
    Iterator<List<String>> result = listJson(resourceType, requestParameters, partitionKey, partitions);
Parameters:
resourceType - The resource type to query / filter / list. Ex. event, asset, time series.
requestParameters - The query / filter specification. Follows the Cognite api request parameters.
partitionKey - The key to use for the partitions in the read request. For example partition or cursor.
partitions - An optional set of partitions to read via.
Returns:
An Iterator over the result set.
Throws:
Exception
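Because results arrive as pages, buffering the complete result set means draining the Iterator into a collection of your own. The sketch below shows that pattern with a stand-in iterator built from in-memory lists. In real code the iterator would come from listJson. The PagingSketch class and the bufferAll helper are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class PagingSketch {
    /** Drains every page from the iterator into one list, preserving page order. */
    static List<String> bufferAll(Iterator<List<String>> pages) {
        List<String> all = new ArrayList<>();
        while (pages.hasNext()) {
            all.addAll(pages.next());
        }
        return all;
    }

    public static void main(String[] args) {
        // Stand-in for the paged Json results that listJson would stream.
        Iterator<List<String>> pages = List.of(
                List.of("{\"id\":1}", "{\"id\":2}"),
                List.of("{\"id\":3}")).iterator();
        System.out.println(bufferAll(pages).size()); // prints 3
    }
}
```

Buffering like this gives up the memory benefit of streaming, so it only makes sense when the full result set is known to fit in memory.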
-
retrieveJson
protected List<String> retrieveJson(ResourceType resourceType, Collection<Item> items) throws Exception
Retrieve items by id. Will ignore unknown ids by default.
Example:
    Collection<Item> items = //Collection of items with ids;
    List<String> result = retrieveJson(resourceType, items);
retrieveJson
protected List<String> retrieveJson(ResourceType resourceType, Collection<Item> items, Map<String, Object> parameters) throws Exception
Retrieve items by id. This version allows you to explicitly set additional parameters for the retrieve request. For example: <"ignoreUnknownIds", true> and <"fetchResources", true>.
Example:
    Collection<Item> items = //Collection of items with ids;
    Map<String, Object> parameters = //Parameters;
    List<String> result = retrieveJson(resourceType, items, parameters);
aggregate
protected Aggregate aggregate(ResourceType resourceType, Request requestParameters) throws Exception
Performs an item aggregation request to Cognite Data Fusion. The default aggregation is a total item count based on the (optional) filters in the request. Some resource types, for example Event, support multiple types of aggregation.
Example:
    Aggregate aggregateResult = aggregate(resourceType, requestParameters);
Parameters:
resourceType - The resource type to perform aggregation of.
requestParameters - The request containing filters.
Returns:
The aggregation result.
Throws:
Exception
-
addAuthInfo
Adds the required authentication information into the request object. If the request object already has complete auth info, nothing will be added. The following authentication schemes are supported: 1) API key. When using an api key, this service will look up the corresponding project/tenant to issue requests to.
Example:
    Request requestParams = addAuthInfo(request);
Parameters:
request - The request to enrich with auth information.
Returns:
The request parameters with auth info added to it.
Throws:
Exception
-
getListResponseIterator
protected Iterator<CompletableFuture<ResponseItems<String>>> getListResponseIterator(ResourceType resourceType, Request requestParameters) throws Exception
Throws:
Exception
-
parseItems
Parses a list of item objects in json representation to typed objects.
Example:
    List<String> input = //List of json;
    List<Item> resultList = parseItems(input);
Parameters:
input - The item list in Json string representation.
Returns:
The parsed item objects.
Throws:
Exception
-
toRequestItems
Converts a list of Item to a request object structure (that can later be parsed to Json).
Example:
    Collection<Item> itemList = //Collection of items;
    List<Map<String, Object>> result = toRequestItems(itemList);
Parameters:
itemList - The items to parse.
Returns:
The items in request item object form.
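As a rough picture of the "request object structure", the sketch below turns each item into a single-entry map keyed by externalId or id, the shape that serializes to Json such as {"externalId": "pump-01"} or {"id": 42}. It uses a hypothetical Item record rather than the SDK's Item type, and rejecting items without any identifier is an assumption of the sketch, not documented behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RequestItemsSketch {
    /** Hypothetical stand-in for the SDK's Item; either field may be null. */
    record Item(String externalId, Long id) {}

    /** Maps each item to {"externalId": ...} if present, otherwise {"id": ...}. */
    static List<Map<String, Object>> toRequestItems(List<Item> items) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Item item : items) {
            Map<String, Object> entry = new LinkedHashMap<>();
            if (item.externalId() != null) {
                entry.put("externalId", item.externalId());
            } else if (item.id() != null) {
                entry.put("id", item.id());
            } else {
                // Assumption: an item without any identifier cannot be addressed in a request.
                throw new IllegalArgumentException("Item has neither externalId nor id");
            }
            result.add(entry);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Item> items = List.of(new Item("pump-01", null), new Item(null, 42L));
        System.out.println(toRequestItems(items));
    }
}
```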
-
deDuplicate
De-duplicates a collection of Item.
Example:
    Collection<Item> itemList = //Collection of items;
    List<Item> result = deDuplicate(itemList);
Parameters:
itemList - The items to de-duplicate.
Returns:
The de-duplicated items.
-
itemsHaveId
Returns true if all items contain either an externalId or id.
Example:
    Collection<Item> items = //Collection of items;
    boolean result = itemsHaveId(items);
Parameters:
items - The items to check.
Returns:
true if all items contain either an externalId or id.
-
mapItemToId
Maps all items to their externalId (primary) or id (secondary). If the id function does not return any identity, the item will be mapped to the empty string. Via the identity mapping, this function will also perform deduplication of the input items.
Example:
    Collection<Item> items = //Collection of items;
    Map<String, Item> result = mapItemToId(items);
Parameters:
items - The items to map to externalId / id.
Returns:
The Map with all items mapped to externalId / id.
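The mapping rule (externalId first, then id, then the empty string) and the deduplication it implies can be sketched as follows. The Item record here is a hypothetical stand-in for the SDK's Item type, and the choice that a later duplicate replaces an earlier one is an assumption of the sketch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ItemIdSketch {
    /** Hypothetical stand-in for the SDK's Item; either field may be null. */
    record Item(String externalId, Long id) {}

    /** Returns externalId if set, else id as a string, else the empty string. */
    static String idOf(Item item) {
        if (item.externalId() != null) {
            return item.externalId();
        }
        if (item.id() != null) {
            return String.valueOf(item.id());
        }
        return "";
    }

    /** Maps items by identity; items sharing an identity collapse into one entry. */
    static Map<String, Item> mapItemToId(List<Item> items) {
        Map<String, Item> result = new LinkedHashMap<>();
        for (Item item : items) {
            result.put(idOf(item), item); // assumption: a later duplicate replaces an earlier one
        }
        return result;
    }

    /** True if every item carries an externalId or an id. */
    static boolean itemsHaveId(List<Item> items) {
        return items.stream().allMatch(i -> i.externalId() != null || i.id() != null);
    }

    public static void main(String[] args) {
        List<Item> items = List.of(new Item("a", null), new Item("a", null), new Item(null, 7L));
        System.out.println(mapItemToId(items).keySet()); // prints [a, 7]
    }
}
```

One consequence of this rule is that all items lacking an identity share the empty-string key, so checking with something like itemsHaveId first avoids silently collapsing them.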
-
parseString
Try parsing the specified Json path as a String.
Example:
    String json = //String of json object
    String result = parseString(json, "name");
Parameters:
itemJson - The Json string.
fieldName - The Json path to parse.
Returns:
The Json path as a String.
-
parseName
Returns the name attribute value from a json input.
Example:
    String json = //String of json object
    String result = parseName(json);
Parameters:
json - The json to parse.
Returns:
The name value.
-