Package com.cognite.client
Class EntityMatching
- java.lang.Object
-
- com.cognite.client.EntityMatching
-
public abstract class EntityMatching extends Object
This class represents the Cognite entity matching API endpoint. It provides methods for interacting with the entity matching services.
-
-
Field Summary
Fields (modifier and type, field)
protected static org.slf4j.Logger LOG
-
Constructor Summary
Constructors
EntityMatching()
-
Method Summary
Methods (modifier and type, method: description)
protected Request addAuthInfo(Request request): Adds the required authentication information into the request object.
protected Aggregate aggregate(ResourceType resourceType, Request requestParameters): Performs an item aggregation request to Cognite Data Fusion.
protected List<String> buildPartitionsList(int noPartitions): Builds a list of partition specifications for parallel retrieval from the Cognite API.
List<EntityMatchModel> create(Collection<Request> requests): Trains a model that predicts matches between entities (for example, time series names to asset names).
protected List<Item> deDuplicate(Collection<Item> itemList): De-duplicates a collection of Item.
List<Item> delete(List<Item> entityMatchingModels): Deletes a set of entity matching models.
abstract CogniteClient getClient()
protected Iterator<CompletableFuture<ResponseItems<String>>> getListResponseIterator(ResourceType resourceType, Request requestParameters)
protected boolean itemsHaveId(Collection<Item> items): Returns true if all items contain either an externalId or id.
protected Iterator<List<String>> listJson(ResourceType resourceType, Request requestParameters, String... partitions): Returns the results from a list / filter API endpoint.
protected Iterator<List<String>> listJson(ResourceType resourceType, Request requestParameters, String partitionKey, String... partitions): Returns the results from a list / filter API endpoint.
protected Map<String,Item> mapItemToId(Collection<Item> items): Maps all items to their externalId (primary) or id (secondary).
static EntityMatching of(CogniteClient client): Constructs a new EntityMatching object using the provided configuration.
protected List<Item> parseItems(List<String> input): Parses a list of item objects in JSON representation to typed objects.
protected String parseName(String json): Returns the name attribute value from a JSON input.
protected String parseString(String itemJson, String fieldName): Tries to parse the specified JSON path as a String.
List<EntityMatchResult> predict(long modelId, List<Struct> sources, Collection<Struct> targets): Matches a set of source entities with a set of targets via a given matching model.
List<EntityMatchResult> predict(long modelId, List<Struct> sources, Collection<Struct> targets, int numMatches): Matches a set of source entities with a set of targets via a given matching model.
List<EntityMatchResult> predict(long modelId, List<Struct> sources, Collection<Struct> targets, int numMatches, double scoreThreshold): Matches a set of source entities with a set of targets via a given matching model.
List<EntityMatchResult> predict(String modelExternalId, List<Struct> sources, Collection<Struct> targets): Matches a set of source entities with a set of targets via a given matching model.
List<EntityMatchResult> predict(String modelExternalId, List<Struct> sources, Collection<Struct> targets, int numMatches): Matches a set of source entities with a set of targets via a given matching model.
List<EntityMatchResult> predict(String modelExternalId, List<Struct> sources, Collection<Struct> targets, int numMatches, double scoreThreshold): Matches a set of source entities with a set of targets via a given matching model.
List<EntityMatchResult> predict(Collection<Request> requests): Matches a set of source entities with a set of targets via a given matching model.
protected List<String> retrieveJson(ResourceType resourceType, Collection<Item> items): Retrieves items by id.
protected List<Map<String,Object>> toRequestItems(Collection<Item> itemList): Converts a list of Item to a request object structure (that can later be parsed to JSON).
-
-
-
Method Detail
-
of
public static EntityMatching of(CogniteClient client)
Constructs a new EntityMatching object using the provided configuration. This method is intended for internal use. SDK clients should always use CogniteClient as the entry point to this class.
Parameters:
client - The CogniteClient to use for configuration settings.
Returns:
The entity matching API object.
-
predict
public List<EntityMatchResult> predict(String modelExternalId, List<Struct> sources, Collection<Struct> targets) throws Exception
Matches a set of source entities with a set of targets via a given matching model. If either sources or targets are empty lists, the entity matcher will use the sources/targets from the model training. The default number of matches is 1, and the default score threshold used for matching is 0.
Parameters:
modelExternalId - The external id of the matching model to use.
sources - A list of entities to match from. If the list is empty, the model training sources will be used.
targets - A list of entities to match to. If the list is empty, the model training targets will be used.
Returns:
The entity matching results.
Throws:
Exception
-
predict
public List<EntityMatchResult> predict(String modelExternalId, List<Struct> sources, Collection<Struct> targets, int numMatches) throws Exception
Matches a set of source entities with a set of targets via a given matching model. If either sources or targets are empty lists, the entity matcher will use the sources/targets from the model training. The default score threshold used for matching is 0.
Parameters:
modelExternalId - The external id of the matching model to use.
sources - A list of entities to match from. If the list is empty, the model training sources will be used.
targets - A list of entities to match to. If the list is empty, the model training targets will be used.
numMatches - The maximum number of match candidates per source.
Returns:
The entity matching results.
Throws:
Exception
-
predict
public List<EntityMatchResult> predict(String modelExternalId, List<Struct> sources, Collection<Struct> targets, int numMatches, double scoreThreshold) throws Exception
Matches a set of source entities with a set of targets via a given matching model. If either sources or targets are empty lists, the entity matcher will use the sources/targets from the model training.
Parameters:
modelExternalId - The external id of the matching model to use.
sources - A list of entities to match from. If the list is empty, the model training sources will be used.
targets - A list of entities to match to. If the list is empty, the model training targets will be used.
numMatches - The maximum number of match candidates per source.
scoreThreshold - The minimum score required for a match candidate.
Returns:
The entity matching results.
Throws:
Exception
-
predict
public List<EntityMatchResult> predict(long modelId, List<Struct> sources, Collection<Struct> targets) throws Exception
Matches a set of source entities with a set of targets via a given matching model. If either sources or targets are empty lists, the entity matcher will use the sources/targets from the model training. The default number of matches is 1, and the default score threshold used for matching is 0.
Parameters:
modelId - The internal id of the matching model to use.
sources - A list of entities to match from. If the list is empty, the model training sources will be used.
targets - A list of entities to match to. If the list is empty, the model training targets will be used.
Returns:
The entity matching results.
Throws:
Exception
-
predict
public List<EntityMatchResult> predict(long modelId, List<Struct> sources, Collection<Struct> targets, int numMatches) throws Exception
Matches a set of source entities with a set of targets via a given matching model. If either sources or targets are empty lists, the entity matcher will use the sources/targets from the model training. The default score threshold used for matching is 0.
Parameters:
modelId - The internal id of the matching model to use.
sources - A list of entities to match from. If the list is empty, the model training sources will be used.
targets - A list of entities to match to. If the list is empty, the model training targets will be used.
numMatches - The maximum number of match candidates per source.
Returns:
The entity matching results.
Throws:
Exception
-
predict
public List<EntityMatchResult> predict(long modelId, List<Struct> sources, Collection<Struct> targets, int numMatches, double scoreThreshold) throws Exception
Matches a set of source entities with a set of targets via a given matching model. If either sources or targets are empty lists, the entity matcher will use the sources/targets from the model training.
Parameters:
modelId - The internal id of the matching model to use.
sources - A list of entities to match from. If the list is empty, the model training sources will be used.
targets - A list of entities to match to. If the list is empty, the model training targets will be used.
numMatches - The maximum number of match candidates per source.
scoreThreshold - The minimum score required for a match candidate.
Returns:
The entity matching results.
Throws:
Exception
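The shorter predict overloads document defaults of 1 for numMatches and 0 for scoreThreshold. A minimal sketch of one plausible way such defaults could be resolved is a chain of delegating overloads. The class PredictDefaultsSketch and its String return value are hypothetical stand-ins for illustration; they are not the SDK implementation, and the sources/targets parameters are left out to keep the example self-contained.

```java
import java.util.Locale;

// Hypothetical stand-in for illustration only; this is not the SDK class.
final class PredictDefaultsSketch {
    static final int DEFAULT_NUM_MATCHES = 1;          // default stated in the documentation
    static final double DEFAULT_SCORE_THRESHOLD = 0d;  // default stated in the documentation

    // Shortest overload: both defaults are filled in by delegation.
    static String predict(long modelId) {
        return predict(modelId, DEFAULT_NUM_MATCHES);
    }

    // Middle overload: only the score threshold default is filled in.
    static String predict(long modelId, int numMatches) {
        return predict(modelId, numMatches, DEFAULT_SCORE_THRESHOLD);
    }

    // Full overload: returns a description of the resolved parameters.
    static String predict(long modelId, int numMatches, double scoreThreshold) {
        return String.format(Locale.ROOT,
                "modelId=%d, numMatches=%d, scoreThreshold=%.1f",
                modelId, numMatches, scoreThreshold);
    }
}
```

Under this assumption, calling the shortest overload is equivalent to calling the full overload with numMatches = 1 and scoreThreshold = 0.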
-
predict
public List<EntityMatchResult> predict(Collection<Request> requests) throws Exception
Matches a set of source entities with a set of targets via a given matching model. If either sources or targets are empty lists, the entity matcher will use the sources/targets from the model training. All input parameters are provided via the request object.
Parameters:
requests - Input parameters for the predict jobs.
Returns:
The entity match results.
Throws:
Exception
-
create
public List<EntityMatchModel> create(Collection<Request> requests) throws Exception
Trains a model that predicts matches between entities (for example, time series names to asset names). This is also known as fuzzy joining. If there are no trueMatches (labeled data), a static (unsupervised) model is trained; otherwise, a machine-learned (supervised) model is trained. All input parameters are provided via the request object.
Parameters:
requests - Input parameters for the create model job(s).
Returns:
The created entity match models.
Throws:
Exception
-
delete
public List<Item> delete(List<Item> entityMatchingModels) throws Exception
Deletes a set of entity matching models. The models to delete are identified via their externalId / id by submitting a list of Item.
-
getClient
public abstract CogniteClient getClient()
-
buildPartitionsList
protected List<String> buildPartitionsList(int noPartitions)
Builds a list of partition specifications for parallel retrieval from the Cognite API. This specification is used as a parameter together with the filter / list endpoints. The number of partitions indicates the number of parallel read streams. Employ one partition specification per read stream.
Parameters:
noPartitions - The total number of partitions.
Returns:
A List of partition specifications.
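As a rough illustration of what a partition specification list could look like, the sketch below assumes that each specification is a string of the form "m/n" (1-based partition index over the total partition count). That format, the class name PartitionSketch, and the argument check are assumptions made for this example; the page above does not state the format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the "m/n" string format is an assumption.
final class PartitionSketch {
    // Builds one specification per read stream: "1/n", "2/n", ..., "n/n".
    static List<String> buildPartitionsList(int noPartitions) {
        if (noPartitions < 1) {
            throw new IllegalArgumentException("noPartitions must be at least 1");
        }
        List<String> partitions = new ArrayList<>(noPartitions);
        for (int i = 1; i <= noPartitions; i++) {
            partitions.add(i + "/" + noPartitions);
        }
        return partitions;
    }
}
```

With noPartitions = 3, this sketch yields three specifications, one for each parallel read stream.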
-
listJson
protected Iterator<List<String>> listJson(ResourceType resourceType, Request requestParameters, String... partitions) throws Exception
Returns the results from a list / filter API endpoint, for example the filter assets endpoint. The results are paged through / iterated over via an Iterator. The entire results set is not buffered in memory, but streamed in "pages" from the Cognite API. If you need to buffer the entire results set, you have to stream these results into your own data structure. This method supports parallel retrieval via a set of partition specifications. The specified partitions will be collected and merged together before being returned via the Iterator.
Parameters:
resourceType - The resource type to query / filter / list. For example: event, asset, time series.
requestParameters - The query / filter specification. Follows the Cognite API request parameters.
partitions - An optional set of partitions to read via.
Returns:
An Iterator over the results set.
Throws:
Exception
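Because the results arrive page by page, buffering the whole results set means draining the Iterator into a collection of your own. The following is a minimal sketch of that pattern. The class PageBufferSketch and the samplePages helper are hypothetical; samplePages only imitates an Iterator<List<String>> of JSON strings and does not call the Cognite API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of buffering a paged iterator into memory.
final class PageBufferSketch {
    // Drains every page from the iterator into a single in-memory list.
    static List<String> bufferAll(Iterator<List<String>> pages) {
        List<String> all = new ArrayList<>();
        while (pages.hasNext()) {
            all.addAll(pages.next()); // each call to next() yields one page of JSON strings
        }
        return all;
    }

    // Stand-in for a paged result: two pages holding three JSON items in total.
    static Iterator<List<String>> samplePages() {
        return Arrays.asList(
                Arrays.asList("{\"id\":1}", "{\"id\":2}"),
                Arrays.asList("{\"id\":3}")
        ).iterator();
    }
}
```

Only buffer like this when the full results set is known to fit in memory; otherwise process each page as it arrives.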
-
listJson
protected Iterator<List<String>> listJson(ResourceType resourceType, Request requestParameters, String partitionKey, String... partitions) throws Exception
Returns the results from a list / filter API endpoint, for example the filter assets endpoint. The results are paged through / iterated over via an Iterator. The entire results set is not buffered in memory, but streamed in "pages" from the Cognite API. If you need to buffer the entire results set, you have to stream these results into your own data structure. This method supports parallel retrieval via a set of partition specifications. The specified partitions will be collected and merged together before being returned via the Iterator.
Parameters:
resourceType - The resource type to query / filter / list. For example: event, asset, time series.
requestParameters - The query / filter specification. Follows the Cognite API request parameters.
partitionKey - The key to use for the partitions in the read request. For example, partition or cursor.
partitions - An optional set of partitions to read via.
Returns:
An Iterator over the results set.
Throws:
Exception
-
retrieveJson
protected List<String> retrieveJson(ResourceType resourceType, Collection<Item> items) throws Exception
Retrieve items by id.
-
aggregate
protected Aggregate aggregate(ResourceType resourceType, Request requestParameters) throws Exception
Performs an item aggregation request to Cognite Data Fusion. The default aggregation is a total item count based on the (optional) filters in the request. Some resource types, for example Event, support multiple types of aggregation.
Parameters:
resourceType - The resource type to perform aggregation of.
requestParameters - The request containing filters.
Returns:
The aggregation result.
Throws:
Exception
See Also:
Cognite API v1 specification
-
addAuthInfo
protected Request addAuthInfo(Request request) throws Exception
Adds the required authentication information into the request object. If the request object already has complete auth info, nothing will be added. The following authentication schemes are supported: 1) API key. When using an API key, this service will look up the corresponding project/tenant to issue requests to.
Parameters:
request - The request to enrich with auth information.
Returns:
The request parameters with auth info added to it.
Throws:
Exception
-
getListResponseIterator
protected Iterator<CompletableFuture<ResponseItems<String>>> getListResponseIterator(ResourceType resourceType, Request requestParameters) throws Exception
Throws:
Exception
-
parseItems
protected List<Item> parseItems(List<String> input) throws Exception
Parses a list of item objects in JSON representation to typed objects.
Parameters:
input - The item list in JSON string representation.
Returns:
The parsed item objects.
Throws:
Exception
-
toRequestItems
protected List<Map<String,Object>> toRequestItems(Collection<Item> itemList)
Converts a list of Item to a request object structure (that can later be parsed to JSON).
Parameters:
itemList - The items to parse.
Returns:
The items in request item object form.
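To make the "request object structure" more concrete, here is a minimal sketch that turns items into a list of single-entry maps keyed by "externalId" or "id". The SimpleItem class is a hypothetical, simplified stand-in for the SDK's Item type, and the choice to prefer externalId and to reject items without any identifier is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; SimpleItem is not the SDK's Item type.
final class RequestItemsSketch {
    static final class SimpleItem {
        final String externalId; // may be null
        final Long id;           // may be null

        SimpleItem(String externalId, Long id) {
            this.externalId = externalId;
            this.id = id;
        }
    }

    // Converts each item to a map such as {"externalId": "pump-1"} or {"id": 42}.
    static List<Map<String, Object>> toRequestItems(Collection<SimpleItem> itemList) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (SimpleItem item : itemList) {
            Map<String, Object> entry = new LinkedHashMap<>();
            if (item.externalId != null) {
                entry.put("externalId", item.externalId); // assumption: externalId takes precedence
            } else if (item.id != null) {
                entry.put("id", item.id);
            } else {
                throw new IllegalArgumentException("Item has neither externalId nor id");
            }
            result.add(entry);
        }
        return result;
    }
}
```

A JSON library can then serialize the resulting list of maps directly into the items array of a request body.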
-
deDuplicate
protected List<Item> deDuplicate(Collection<Item> itemList)
De-duplicates a collection of Item.
Parameters:
itemList - The items to de-duplicate.
Returns:
The de-duplicated items.
-
itemsHaveId
protected boolean itemsHaveId(Collection<Item> items)
Returns true if all items contain either an externalId or id.
Parameters:
items - The items to check.
Returns:
true if all items contain either an externalId or id.
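The check described above can be sketched as a simple loop that fails on the first item lacking both identifiers. SimpleItem is again a hypothetical stand-in for the SDK's Item type, and returning true for an empty collection is an assumption of this sketch rather than documented behavior.

```java
import java.util.Collection;

// Hypothetical sketch; SimpleItem is not the SDK's Item type.
final class ItemsHaveIdSketch {
    static final class SimpleItem {
        final String externalId; // may be null
        final Long id;           // may be null

        SimpleItem(String externalId, Long id) {
            this.externalId = externalId;
            this.id = id;
        }
    }

    // Returns false as soon as one item has neither externalId nor id.
    static boolean itemsHaveId(Collection<SimpleItem> items) {
        for (SimpleItem item : items) {
            if (item.externalId == null && item.id == null) {
                return false;
            }
        }
        return true; // assumption: an empty collection passes the check
    }
}
```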
-
mapItemToId
protected Map<String,Item> mapItemToId(Collection<Item> items)
Maps all items to their externalId (primary) or id (secondary). If the id function does not return any identity, the item will be mapped to the empty string. Via the identity mapping, this function will also perform deduplication of the input items.
Parameters:
items - The items to map to externalId / id.
Returns:
The Map with all items mapped to externalId / id.
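The identity mapping and its de-duplication side effect can be illustrated with a small sketch. SimpleItem is a hypothetical stand-in for the SDK's Item type, and letting a later duplicate overwrite an earlier one is an assumption; the description above does not say which duplicate is kept.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; SimpleItem is not the SDK's Item type.
final class ItemIdMappingSketch {
    static final class SimpleItem {
        final String externalId; // may be null
        final Long id;           // may be null

        SimpleItem(String externalId, Long id) {
            this.externalId = externalId;
            this.id = id;
        }
    }

    // externalId is the primary key, id the secondary one, "" the fallback.
    static String key(SimpleItem item) {
        if (item.externalId != null) {
            return item.externalId;
        }
        if (item.id != null) {
            return String.valueOf(item.id);
        }
        return "";
    }

    // Items sharing a key collapse into one map entry, which de-duplicates the input.
    static Map<String, SimpleItem> mapItemToId(Collection<SimpleItem> items) {
        Map<String, SimpleItem> result = new LinkedHashMap<>();
        for (SimpleItem item : items) {
            result.put(key(item), item); // assumption: the last duplicate wins
        }
        return result;
    }
}
```

Note that all items without any identifier share the empty-string key, so at most one of them survives the mapping.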
-
parseString
protected String parseString(String itemJson, String fieldName)
Tries to parse the specified JSON path as a String.
Parameters:
itemJson - The JSON string.
fieldName - The JSON path to parse.
Returns:
The JSON path value as a String.
-
-