1 to N Identification

Overview

The 1 to N functions allow you to enroll face recognition templates (Faceprints) into a database of face templates called a collection, then allow you to efficiently search through these collections for an identity. Since the 1 to N functions are quite involved, it is highly recommended you read through all the information below, as well as reading through the FAQs.

Using Multiple SDK Instances in a Single Process

In order to avoid loading the same collection into memory multiple times (which becomes an issue when the collection sizes become very large), instances of the SDK created within the same process will share the same collection in memory (RAM). This means when you enroll a Faceprint into the collection using one instance of the SDK, it will be available in all other instances of the SDK in the same process. For this same reason, applications which have multiple instances of the SDK in a single process only need to call tfsdk.SDK.create_database_connection() and tfsdk.SDK.create_load_collection() on a single instance of the SDK and all other instances will automatically be connected to the same database and collection. For more information on tfsdk.SDK.create_database_connection() and tfsdk.SDK.create_load_collection(), refer to “How do createDatabaseConnection and createLoadCollection work?” on the FAQ page.
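As a minimal sketch of the point above, the helper below connects to the database and loads the collection through only the first SDK instance in a process. The helper name connect_shared_collection and the no_error argument are illustrative assumptions, not part of the SDK: no_error stands for whichever tfsdk.ERRORCODE value signals success, which this page does not name. Only the two documented methods are called.

```python
from typing import Any, Sequence


def connect_shared_collection(
    sdk_instances: Sequence[Any],
    connection_string: str,
    collection_name: str,
    no_error: Any,
) -> None:
    """Connect and load the collection through the first SDK instance only.

    Per the section above, other instances in the same process share the
    database connection and the in-memory collection, so they need no calls.
    `no_error` is the success value of tfsdk.ERRORCODE, supplied by the
    caller (an assumption of this sketch).
    """
    if not sdk_instances:
        raise ValueError("at least one SDK instance is required")

    primary = sdk_instances[0]

    ret = primary.create_database_connection(connection_string)
    if ret != no_error:
        raise RuntimeError(f"create_database_connection failed: {ret}")

    ret = primary.create_load_collection(collection_name)
    if ret != no_error:
        raise RuntimeError(f"create_load_collection failed: {ret}")
```

The remaining instances can then be used for enrollment and identification without any further setup calls.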

Database Management Systems and Collection Synchronization

The PostgreSQL backend option also has built-in synchronization across multiple processes. Take an example where two processes on different machines, A and B, are connected to the same PostgreSQL backend. Each process initially connects to the same database and collection and therefore loads all the Faceprints from the database into memory (RAM). If process A then enrolls a Faceprint into the collection, this both adds the Faceprint to the in-memory (RAM) collection of process A and updates the PostgreSQL database. In doing so, it also automatically pushes out a notification to all the subscribed processes connected to the same database and collection. Any process connected to the same database and collection is automatically subscribed to updates; no additional action is required from the developer. Process B therefore receives a notification that an update was made and automatically enrolls the same Faceprint into its in-memory (RAM) collection, so processes A and B have synchronized collections. Note that it can take up to 30 seconds for subscribed processes to receive the notification.

This sort of multi-process synchronization is not supported by the SQLite backend. With the SQLite backend, if process A makes a change to the database, process B will not know of the change. Process B must call tfsdk.SDK.create_load_collection() again in order to register the changes that process A made to the database. Note that doing so will not perform an incremental update, but will instead discard and then reload all the data into memory, which can be slow if the collection is large. This is why it is advised to use the SQLite backend only for use cases in which a single process connects to the database. If multiple processes need to connect to a database (and require synchronization), it is advised to use the PostgreSQL backend.

Memory Requirements and Workarounds

Using the following table, you can compute roughly how much RAM is required for storage of Faceprint templates. Although the following estimates are very conservative, realize that the exact size ultimately depends on the length of the identity string you choose.

Faceprint Storage RAM Requirements

Num Faceprints   TFV5         TFV5*        FULL         FULL*       LITE        LITE*
1                2250 Bytes   1250 Bytes   2298 Bytes   750 Bytes   762 Bytes   506 Bytes
1,000            2.15 MB      1.19 MB      2.19 MB      0.72 MB     0.73 MB     0.48 MB
1,000,000        2.10 GB      1.16 GB      2.14 GB      0.70 GB     0.71 GB     0.47 GB
10,000,000       20.96 GB     11.64 GB     21.40 GB     6.98 GB     7.07 GB     4.71 GB
100,000,000      209.55 GB    116.42 GB    214.02 GB    69.85 GB    70.97 GB    47.12 GB

*tfsdk.ConfigurationOptions.fr_vector_compression flag enabled
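For collection sizes that fall between the rows of the table, the estimate can be computed directly from the per-Faceprint sizes in the first row. The sketch below is illustrative only: the function name is made up for this example, and it assumes the table's megabyte and gigabyte figures use 1024-based units, which appears consistent with the published rows to within rounding.

```python
# Bytes per Faceprint, copied from the first row of the table above.
# Keys ending in "*" are the columns with fr_vector_compression enabled.
BYTES_PER_FACEPRINT = {
    "TFV5": 2250,
    "TFV5*": 1250,
    "FULL": 2298,
    "FULL*": 750,
    "LITE": 762,
    "LITE*": 506,
}

# Assumption: the table uses 1024-based gigabytes.
BYTES_PER_GB = 1024 ** 3


def estimate_collection_ram_gb(num_faceprints: int, model: str) -> float:
    """Approximate RAM, in gigabytes, needed to hold `num_faceprints`."""
    if num_faceprints < 0:
        raise ValueError("num_faceprints must be non-negative")
    return num_faceprints * BYTES_PER_FACEPRINT[model] / BYTES_PER_GB


if __name__ == "__main__":
    # Print an estimate for a collection of 5 million Faceprints.
    for model in BYTES_PER_FACEPRINT:
        estimate = estimate_collection_ram_gb(5_000_000, model)
        print(f"{model:6s} {estimate:6.2f} GB")
```

As the text above notes, the real footprint also depends on the length of the identity strings, so treat the result as a rough planning figure.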

For most use cases, even embedded devices have enough RAM to search through medium to large collections (e.g. an RPi 4 can handle a few million Faceprints). However, when running 1 to N identification on massive collections (tens or hundreds of millions of Faceprints) on a lightweight embedded device, you may find the device does not have sufficient RAM to hold the entire collection in memory. In these situations, run the actual 1 to N search on a beefy server which has sufficient RAM. Process the video streams on the embedded devices at the edge to generate feature vectors for the detected faces, then send these feature vectors to the server (or cluster of servers) to run the 1 to N identification functions (e.g. tfsdk.SDK.identify_top_candidate()). The server should also handle enrolling and deleting Faceprints from the collection as required (these functions can also be exposed to the edge devices as REST API endpoints). Hence, the edge devices only generate feature vectors, while only the servers are connected to the database and perform the searches. To simplify things (and avoid having to write your own REST API server), you can have your edge devices send the feature vectors to an instance of the PTOP running on your server to perform the matching.

Selecting the Best Enrollment Image

It is imperative that enrollment images are of high quality. Enrolling low quality images into a collection can result in false positives (incorrect identifications). For best performance, ensure the image meets the following criteria:

If the criteria above are not met, it is advised you reject the image and that you do not enroll it into a collection. For more information on how to use these functions to filter images, refer to the enroll_in_database.py sample app which comes shipped in the download bundle.

We also advise you to save your enrollment images in a database of your own choosing (and map them to the UUID returned by tfsdk.SDK.enroll_faceprint()). That way, if we release a new, improved face recognition model in the future, you will be able to regenerate a Faceprint for each of your enrollment images using the new model. The Trueface SDK does not store your images; it was deliberately designed to remain lean.
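One possible way to keep that mapping is a small table of your own, separate from the SDK's database. The sketch below uses Python's built-in sqlite3 module; the table name, column names and helper functions are all hypothetical choices made for this example, and the UUID is assumed to be the string returned by tfsdk.SDK.enroll_faceprint().

```python
import sqlite3
from typing import List, Tuple


def open_image_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the application's own image index database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS enrollment_images (
            uuid       TEXT PRIMARY KEY,
            identity   TEXT NOT NULL,
            image_path TEXT NOT NULL
        )
        """
    )
    return conn


def record_enrollment(
    conn: sqlite3.Connection, uuid: str, identity: str, image_path: str
) -> None:
    """Store which image produced the Faceprint with the given UUID."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO enrollment_images "
            "(uuid, identity, image_path) VALUES (?, ?, ?)",
            (uuid, identity, image_path),
        )


def images_for_identity(
    conn: sqlite3.Connection, identity: str
) -> List[Tuple[str, str]]:
    """Return (uuid, image_path) pairs for one identity, ordered by UUID."""
    return conn.execute(
        "SELECT uuid, image_path FROM enrollment_images "
        "WHERE identity = ? ORDER BY uuid",
        (identity,),
    ).fetchall()
```

The default path of ":memory:" is only convenient for experimenting; a real deployment would pass a file path so the mapping survives restarts.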

SDK.create_database_connection(self: tfsdk.SDK, database_connection_string: str) -> tfsdk.ERRORCODE

Create a connection to a new or existing database. If the database does not exist, a new one will be created with the provided name. If the tfsdk.DATABASEMANAGEMENTSYSTEM.NONE (memory only) configuration option is selected, this function does not need to be called (and is a harmless no-op). If connecting to a hosted PostgreSQL database (AWS, Digital Ocean, etc), see the HOSTED_DATABASE environment variable on the environment variable page. If connecting to a PostgreSQL database hosted on the same machine, you must specify the “host” parameter. If connecting to a PostgreSQL database hosted on a remote server or a different machine on the same network, you must instead specify the “hostaddr” parameter with the URL or IP of said database.

Parameters

database_connection_string – The database connection string. If tfsdk.DATABASEMANAGEMENTSYSTEM.SQLITE is selected, this should be the filepath to the database, e.g. “/myPath/myDatabase.db”. If tfsdk.DATABASEMANAGEMENTSYSTEM.POSTGRESQL is selected, this should be a database connection string, e.g. “hostaddr=192.168.1.0 port=5432 dbname=face_recognition user=postgres password=my_password” or “host=localhost port=5432 dbname=face_recognition user=postgres password=my_password”. Here is a list of all supported PostgreSQL connection parameters. To enable SSL, add “sslmode=require” to the connection string.

Returns

Error code, see ERRORCODE
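To illustrate the PostgreSQL connection string format described above, the following sketch assembles one from its parts. The function name and its arguments are hypothetical and exist only for this example; the keys it emits (host, hostaddr, port, dbname, user, password, sslmode) are the ones shown in the examples above. It does not quote values, so it assumes none of them contain spaces.

```python
from typing import Optional


def build_postgres_connection_string(
    dbname: str,
    user: str,
    password: str,
    host: Optional[str] = None,
    hostaddr: Optional[str] = None,
    port: int = 5432,
    require_ssl: bool = False,
) -> str:
    """Build a key=value connection string like the examples above.

    Pass `host` for a database on the same machine and `hostaddr` for a
    remote one, as described for create_database_connection().
    """
    if (host is None) == (hostaddr is None):
        raise ValueError("specify exactly one of host or hostaddr")

    location = f"host={host}" if host is not None else f"hostaddr={hostaddr}"
    parts = [
        location,
        f"port={port}",
        f"dbname={dbname}",
        f"user={user}",
        f"password={password}",
    ]
    if require_ssl:
        parts.append("sslmode=require")
    return " ".join(parts)
```

The resulting string would then be passed as database_connection_string when the PostgreSQL backend is selected.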

SDK.create_load_collection(self: tfsdk.SDK, collection_name: str) -> tfsdk.ERRORCODE

Create a new collection, or load data from an existing collection into memory (RAM) if one with the provided name already exists in the database. Equivalent to calling tfsdk.SDK.create_collection() then tfsdk.SDK.load_collection().

Parameters

collection_name – the name of the collection to create and load.

Returns

Error code, see ERRORCODE

SDK.create_collection(self: tfsdk.SDK, collection_name: str) -> tfsdk.ERRORCODE

Create a new collection in the database. Before enrolling Faceprints into a newly created collection, you must call tfsdk.SDK.load_collection(). If a collection with the provided name already exists, this is a harmless no-op.

Parameters

collection_name – the name of the collection to create.

Returns

Error code, see ERRORCODE

SDK.load_collection(self: tfsdk.SDK, collection_name: str) -> tfsdk.ERRORCODE

Loads the collection into memory. Must be called before enrolling Faceprints or calling identification functions. Will return an error if the collection does not exist.

Parameters

collection_name – the name of the collection to load into memory.

Returns

Error code, see ERRORCODE

SDK.delete_collection(self: tfsdk.SDK, collection_name: str) -> tfsdk.ERRORCODE

Deletes a collection from the current database. Will return an error if the collection does not exist.

Parameters

collection_name – the name of the collection to delete.

Returns

Error code, see ERRORCODE

SDK.get_collection_names(self: tfsdk.SDK) -> Tuple[tfsdk.ERRORCODE, List[str]]

Get a list of the names of all the collections in the database. Collection names can then be passed to tfsdk.SDK.get_collection_metadata() and tfsdk.SDK.get_collection_identities().

Returns

The ERRORCODE and a list of the collection names in the database.

SDK.get_collection_metadata(self: tfsdk.SDK, collection_name: str) -> Tuple[tfsdk.ERRORCODE, tfsdk.CollectionMetadata]

Get the metadata for the specified collection in the database, loaded or unloaded.

Parameters

collection_name – the name of the collection for which to retrieve the metadata.

Returns

The ERRORCODE and the tfsdk.CollectionMetadata, in that order.

SDK.get_collection_identities(self: tfsdk.SDK, collection_name: str) -> Tuple[tfsdk.ERRORCODE, str]

Get a map of identities and UUIDs for the specified collection in the database, loaded or unloaded. This can be a slow operation (especially for unloaded collections), so call it sparingly.

Parameters

collection_name – the name of the collection for which to retrieve the identities.

Returns

The ERRORCODE and a json string representation of the collection, in that order. The json string is a map of UUIDs to identities. There can be multiple UUIDs mapped to a single identity.
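Because several UUIDs can map to one identity, it is often handy to invert the returned map. The sketch below assumes the JSON string is a flat object of the form {"<UUID>": "<identity>", ...}, which is how the description above reads but is not confirmed here; the function name is made up for this example.

```python
import json
from collections import defaultdict
from typing import Dict, List


def group_uuids_by_identity(identities_json: str) -> Dict[str, List[str]]:
    """Invert a UUID -> identity JSON map into identity -> list of UUIDs.

    Assumes `identities_json` is a flat JSON object whose keys are UUID
    strings and whose values are identity strings.
    """
    uuid_to_identity = json.loads(identities_json)
    grouped: Dict[str, List[str]] = defaultdict(list)
    for uuid, identity in uuid_to_identity.items():
        grouped[identity].append(uuid)
    return dict(grouped)
```

For instance, a hypothetical input of '{"u1": "alice", "u2": "bob", "u3": "alice"}' groups "u1" and "u3" under "alice".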

SDK.enroll_faceprint(self: tfsdk.SDK, faceprint: tfsdk.Faceprint, identity: str) -> Tuple[tfsdk.ERRORCODE, str]

Enroll a Faceprint for a new or existing identity in the collection.

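Parameters
  • faceprint – the tfsdk.Faceprint to enroll in the collection.

  • identity – the identity string to associate with the Faceprint.

Returns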

The ERRORCODE and a UUID string corresponding to the enrolled Faceprint, in that order. It is advised the caller saves the UUID string.

SDK.remove_by_UUID(self: tfsdk.SDK, UUID: str) -> tfsdk.ERRORCODE

Remove a Faceprint from the collection using the UUID.

Parameters

UUID – the UUID returned by tfsdk.SDK.enroll_faceprint().

Returns

The ERRORCODE.

SDK.remove_by_identity(self: tfsdk.SDK, identity: str) -> Tuple[tfsdk.ERRORCODE, int]

Remove all Faceprints in the collection corresponding to the identity.

Parameters

identity – the identity to remove from the collection.

Returns

The ERRORCODE and an int representing the number of tfsdk.Faceprint removed from the collection.

SDK.identify_top_candidate(self: tfsdk.SDK, faceprint: tfsdk.Faceprint, threshold: float = 0.4) -> Tuple[tfsdk.ERRORCODE, bool, tfsdk.Candidate]

Get the top match candidate in the collection and the corresponding similarity score and match probability.

Parameters
  • faceprint – the probe tfsdk.Faceprint to be identified.

  • threshold – the similarity score threshold above which it is considered a match (default = 0.4). Higher thresholds may result in faster queries. Refer to the ROC curves when selecting a threshold.

Returns

The ERRORCODE, a bool indicating if a candidate was found, and the tfsdk.Candidate, in that order.
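A minimal sketch of handling the three returned values is shown below. The wrapper name identify_or_none and the no_error argument are assumptions made for this example: no_error stands for the tfsdk.ERRORCODE value that signals success, which this page does not name. The sdk argument is expected to be a tfsdk.SDK instance with a collection already loaded.

```python
from typing import Any, Optional


def identify_or_none(
    sdk: Any,
    faceprint: Any,
    no_error: Any,
    threshold: float = 0.4,
) -> Optional[str]:
    """Return the identity of the top candidate, or None if nothing matched.

    Raises RuntimeError if the SDK reports an error code other than
    `no_error` (the caller-supplied success value).
    """
    ret, found, candidate = sdk.identify_top_candidate(
        faceprint, threshold=threshold
    )
    if ret != no_error:
        raise RuntimeError(f"identify_top_candidate failed: {ret}")
    if not found:
        # No Faceprint in the collection scored above the threshold.
        return None
    return candidate.identity
```

Checking the bool before reading the candidate matters because, when no match is found, the contents of the returned tfsdk.Candidate are not described on this page.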

SDK.batch_identify_top_candidate(self: tfsdk.SDK, faceprints: List[tfsdk.Faceprint], threshold: float = 0.4) -> Tuple[tfsdk.ERRORCODE, List[bool], List[tfsdk.Candidate]]

Get the top match candidate in the collection and the corresponding similarity score and match probability for each probe faceprint. Like tfsdk.SDK.identify_top_candidate(), but runs search queries in parallel and improves throughput.

Parameters
  • faceprints – a list of probe tfsdk.Faceprint to be identified.

  • threshold – the similarity score threshold above which it is considered a match (default = 0.4). Higher thresholds may result in faster queries. Refer to the ROC curves when selecting a threshold.

Returns

The ERRORCODE, a list of bools indicating whether a candidate was found for each probe Faceprint, and a list of tfsdk.Candidate, in that order.
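The signature above returns a list of bools alongside the list of candidates. The sketch below pairs them up, assuming (this is not stated on this page) that both lists are aligned index by index with the input probes. The wrapper name and the no_error argument are hypothetical; no_error stands for the tfsdk.ERRORCODE success value.

```python
from typing import Any, List, Optional, Sequence


def batch_identify_identities(
    sdk: Any,
    faceprints: Sequence[Any],
    no_error: Any,
    threshold: float = 0.4,
) -> List[Optional[str]]:
    """Return one identity (or None for no match) per probe Faceprint.

    Assumes the returned flags and candidates are in the same order as
    `faceprints`.
    """
    ret, found_flags, candidates = sdk.batch_identify_top_candidate(
        list(faceprints), threshold=threshold
    )
    if ret != no_error:
        raise RuntimeError(f"batch_identify_top_candidate failed: {ret}")
    return [
        candidate.identity if found else None
        for found, candidate in zip(found_flags, candidates)
    ]
```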

SDK.identify_top_candidates(self: tfsdk.SDK, faceprint: tfsdk.Faceprint, num_candidates: int, threshold: float = 0.4) -> Tuple[tfsdk.ERRORCODE, bool, List[tfsdk.Candidate]]

Get a list of the top n match candidates in the collection and their corresponding similarity scores and match probabilities.

Parameters
  • faceprint – the probe tfsdk.Faceprint to be identified.

  • num_candidates – the number of tfsdk.Candidate to return.

  • threshold – the similarity score threshold above which it is considered a match (default = 0.4). Higher thresholds may result in faster queries. Refer to the ROC curves when selecting a threshold.

Returns

The ERRORCODE, a bool indicating if at least one match is found, and a list of tfsdk.Candidate, in that order.

class tfsdk.Candidate

property UUID

The UUID of the matching Faceprint.

property identity

The identity of the match.

property match_probability

The probability the two face feature vectors are a match.

property similarity_measure

The computed similarity measure.

to_dict(self: tfsdk.Candidate) -> dict

Return a dictionary representation of the object.

class tfsdk.CollectionMetadata

property collection_name

The name of the collection.

property encrypted

Indicates if the collection is encrypted.

property feature_vector_size_bytes

The size of the Faceprint feature vector in bytes.

property model_name

The name of the face recognition model used to generate the Faceprints enrolled in the collection.

property model_options

Additional options which were used to generate Faceprints enrolled in the collection.

property num_faceprints

The total number of Faceprints enrolled in the collection.

property num_identities

The number of unique identities in the collection.
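As a small illustration of reading these properties, the sketch below condenses a metadata object into a dictionary and derives the average number of Faceprints per identity. The function name and the derived field are assumptions made for this example; only the properties listed above are read.

```python
from typing import Any, Dict


def summarize_collection(metadata: Any) -> Dict[str, Any]:
    """Summarize a tfsdk.CollectionMetadata-like object as a dictionary."""
    num_faceprints = metadata.num_faceprints
    num_identities = metadata.num_identities
    # Guard against an empty collection to avoid dividing by zero.
    per_identity = num_faceprints / num_identities if num_identities else 0.0
    return {
        "collection_name": metadata.collection_name,
        "model_name": metadata.model_name,
        "encrypted": metadata.encrypted,
        "num_faceprints": num_faceprints,
        "num_identities": num_identities,
        "faceprints_per_identity": per_identity,
    }
```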