Frequently Asked Questions
Some FAQ answers may use C++ code snippets; however, the answers apply to the python bindings SDK too.
How does face recognition feature extraction work?
A face recognition feature vector serves as a mathematical representation describing the abstract characteristics of a face within a higher-dimensional space. The process of face recognition feature extraction comprises several distinct steps. Initially, an algorithm is used to detect the presence of a face in the input image. Upon successful detection, the same algorithm proceeds to identify facial landmarks, including the eyes, nose, and mouth. The coordinates of these landmarks are then utilized in a step known as chip extraction, where the face is normalized and aligned. This normalization process ensures consistency in facial orientation and enhances the accuracy of subsequent recognition steps. Subsequently, the normalized face chip undergoes analysis by a face recognition machine learning algorithm. This algorithm, trained to discern unique facial patterns, generates a feature vector that succinctly captures the distinguishing characteristics of the face. This feature vector becomes a compact and informative representation, enabling efficient comparisons and recognition tasks in face recognition systems.
How does 1 to 1 face recognition matching work?
In a 1 to 1 face matching scenario, the process involves utilizing two input images to generate feature vectors, following the procedure outlined earlier. Each input image undergoes this feature vector extraction process to represent the unique facial characteristics in a higher-dimensional space.
Subsequently, a similarity metric is calculated between the two feature vectors, providing a quantifiable measure of their proximity in the higher-dimensional space. This metric reflects the degree of resemblance or dissimilarity between the facial features captured in the respective images.
To determine whether the two face images correspond to the same identity (genuine match) or belong to different identities (imposter match), a threshold is applied to the calculated similarity metric. This threshold serves as a decisive boundary, allowing the system to categorize matches based on their proximity score.
Selecting an appropriate threshold is a critical step in achieving accurate matching results. This is typically accomplished by consulting a Receiver Operating Characteristic (ROC) curve, which graphically represents the trade-off between true positive rate and false positive rate across different threshold values. The ROC curve aids in finding an optimal threshold that balances the sensitivity and specificity of the face matching system for reliable and efficient performance.
How does 1 to N face recognition search work?
In the 1 to N face recognition pipeline, the initial step involves enrolling feature vectors associated with known identities into a database. This process is typically carried out offline and is for creating a reference repository of individuals, such as those authorized to access a restricted area in an access control scenario.
When an image of an unknown identity is subsequently presented to the system, the algorithm calculates a similarity score metric between the unknown feature vector and all the feature vectors previously enrolled in the database. This comparison aims to find patterns and similarities between the presented face and the stored identities.
The system then identifies a match candidate by selecting the feature vector with the highest similarity score. If this score surpasses a predefined threshold, it indicates a potential match within the database. In such cases, the corresponding identity associated with the matched feature vector is returned, providing a streamlined and automated way to recognize and verify individuals in real-time applications.
How many threads does the SDK use for inference?
The following models can use up to 8 threads for inference:
Face recognition (LITE_V2, LITE_V3, TFV5_2, TFV6, TFV7)
Body pose estimation
Face image orientation detection
Face blur detection
Face template quality estimation
The following models can use up to 4 threads for inference:
106 face landmark detection
If the SDK is configured to use a global inference threadpool, then a single threadpool containing 8 threads will be shared between all inference sessions (in addition to the main driving thread).
Pre and post processing code uses a seperate threadpool and can use up to 8 threads.
Trueface::SDK::batchIdentifyTopCandidate() function is capable of using all the threads on your machine, depending on the number of probe Faceprints provided.
How can I reduce the number of threads used by the SDK?
OMP_NUM_THREADS environment variable to specify the number of threads used by the SDK in the inference threadpool and the pre / post processing threadpool.
Note, it is still possible for the SDK to exceed the specified value if
Trueface::ConfigurationOptions.useGlobalInferenceThreadpool is set to
false as each inference session will have it’s own threadpool whose size is dictated by this variable.
Trueface::ConfigurationOptions.useGlobalInferenceThreadpool is set to
true, the specified number of threads can still be exceeded if using more than one SDK instance or running a parallel inference pipeline (because the pre-post processing threadpool and inference threadpool may be alive at the same time).
The following graph shows the impact of inference threads on inference speed for the
What is a global inference threadpool and when should it be used?
The use of a global threadpool can be enabled or disabled through the
With a global inference threadpool, all inference sessions in the same process will share the same threadpool for inference.
This is the default inference mode and should be used when running inference in a sequential pipeline (ex. preprocess image –> face detection –> spoof detection –> face recognition –> preprocess image…).
It is also advised to use this mode anytime the machine has less than 32 threads. This mode should be used with horizontal scaling.
On larger machines with greater than 32 threads, it is advised to disable the use of a global threadpool. Instead of running inference in a sequential pipeline, you should instead run inference in parallel to see speed improvements. You can for example use a producer / consumer architecture with several models being run in parallel, the output of one being fed as the input to the next using shared queues. Being mindful that each inference session (face detection, face recognition, spoof detection, …) can use 4-8 threads, plan accordingly such that the majority of machine threads are fully utilized while avoiding contention through over saturation. This mode should be used with vertical scaling.
How can I increase throughput?
CPU: The CPU SDK is optimized for low latency, and does not currently support a high-throughput mode.
GPU: Throughput can be increased on GPU by using batching. Currently our face recognition and mask detection modules supports batch inference. You can also increase throughput (and decrease latency) by using images pre-loaded in GPU (ex. decode video stream directly into GPU ram).
Is the SDK threadsafe?
In CPU only mode, the SDK is completely threadsafe. In GPU mode, the SDK is not threadsafe.
How do I choose a similarity threshold for face recognition?
Navigate here and use the ROC curves to select a threshold based on your use case.
What are the differences between the face recognition models?
Each one of our face recognition models provides different speed and accuracy characteristics. You should consult our ROC curves page and benchmarks page to help choose which model is right for your deployment.
Are Faceprints compatible between models?
Faceprints are not compatible between models.
That means that if you have a collection filled with
Trueface::FacialRecognitionModel::TFV6 model Faceprints, you will not be able to run a query against that collection using a
The SDK has internal checks and will throw and error if you accidentally try to do this.
How can I upgrade my collection if is filled with Faceprints from a deprecated model?
As of right now, there is no way to upgrade your existing Faceprints to a new model (such as from
With this in mind, we advise you to save your enrollment images in a database of your own choosing.
That way, when we do release a new and improved face recognition model, you can re-generate Faceprints for all your images using the new model and enroll them into an updated collection.
What is the difference between similarity score and match probability?
The similarity score denotes the mathematical resemblance between two feature vectors within a vector space. Its values span from -1 to 1. To enhance human interpretability, a regression model transforms this similarity score into a match probability, ranging from 0 to 1. However, this conversion makes match probability less suitable for thresholding purposes. In essence, both the similarity score and match probability encapsulate the same underlying concept. Nevertheless, it’s crucial to note that the equation mapping from the similarity score to match probability may undergo occasional adjustments. Therefore, it is of utmost importance to base face recognition decisions, such as determining a match or non-match, on the similarity score rather than the match probability. The similarity score provides a more robust foundation for accurate and reliable decision-making in face recognition scenarios.
Why are small faces not being detected in my very high resolution images?
As of v2.0 of the SDK, the face detector has been modified such that all images are resized and padded to 640x640. Although this has many benefits, one of the drawbacks is that small faces may not be detected in very high resolution images. In a 640x640 image, the detector is able to detect faces as small as 20 pixels in height. Let’s consider a 4K image with dimensions 3840 x 2160. In order to resize to 640x640, the image is scaled down 6X and then padding is added to the top and bottom. A face which was 120 pixels large in the original 4K image is now 20 pixels in the 640x640 image, meaning any face smaller than 120 pixels in the original image will not be detected.
To summarize, the detector can only detect faces in the following dynamic height range. Smallest detectable face = ((the larger of your image dimensions) / 640 * 20) pixels. Largest detectable face = (your image height) pixels.
So in our example above: Smallest detectable face = (3840 / 640 * 20) = 120 pixels Largest detectable face = 2160 pixels
What is the impact of various factors on the face recognition similarity score?
Note, the following metrics were collected in a highly constrained environment and should only be used as guidance.
The following table quantifies the average reduction in similarity score when a face mask is synthetically added to a face image.
Avg. Reduction in Similarity Score
The following table quantifies the average reduction in similarity score when eyeglasses are synthetically added to a face image.
Avg. Reduction in Similarity Score
The following table quantifies the average reduction in similarity score for blurry images.
The comparisons are performed at the point at which sufficient gaussian blur has been applied to the original image such that
Avg. Reduction in Similarity Score
The following table quantifies the average reduction in similarity score for over exposed and under exposed images.
The over exposed comparisons are performed at the point at which brightness and exposure have been sufficiently increased such that
The under exposed comparisons are performed at the point at which brightness and exposure have been sufficiently decreased such that
Avg. Reduction in Similarity Score Over Exposed
Avg. Reduction in Similarity Score Under Exposed
The following charts quantify the average reduction in similarity score for various face heights.
The following charts quantify the average reduction in similarity score for various head yaw and pitch angles.
How to select the best image for enrollment?
In order to ensure optimal performance and reduce false positives, one should ensure that images, and in particular images used for enrollment, are of high quality. See What is the impact of various factors on the face recognition similarity score? to understand how some of the factors mentioned below can impact face recognition similarity score. Below, we breakdown what constitutes a high quality image.
- Face height
Even though the face detector can detect faces as small as 20 pixels in height, it is critical that the face height is sufficiently large to ensure high quality matches. We recommend a minimum face height of 40 pixels for face recognition, and suggest that be increased to 100 pixels for best performance.
Trueface::FaceBoxAndLandmarks.getHeight()to filter out small faces.
- Face orientation
Faces looking away from the camera may not be detected and can also result in lower quality face templates (less face information visible to camera). It is therefore recommended that face be oriented towards the camera within 30 degrees for pitch (head tilted up or down) and 40 degrees for yaw (head rotation left or right). There are no restrictions on head roll.
Trueface::SDK::estimateHeadOrientation()to obtain the face orientation (method returns orientation in radians, must then be converted to degrees).
- Image lighting
Ensure images are captured in appropriate lighting conditions. Is the image too dark or too bright? Is there non-uniform lighting? Is the face backlit? When possible, place the camera away from areas with extreme lighting.
Trueface::SDK::checkFaceImageExposure()to determine if an image is over or under exposed.
Strive for neutral, non-reflective backgrounds. Avoid backgrounds containing other faces.
When processing ID images (ex. passports), there may be some scenarios where a watermark of the face is larger than the primary subject face. In such cases, instead of calling
Trueface::SDK::detectLargestFace(), instead call
Trueface::SDK::detectFaces()and select the detected face at index 0. This will represent face with the highest detection score.
- Facial occlusions
Face recognition works best when a person’s entire face is visible. While face recognition may still work even with occlusions (ex. face mask), for optimal performance, ensure as much of the face as possible is visible to the camera.
Trueface::SDK::detectMask()to check if the person is wearing a face mask. If a mask is detected, the image can be rejected, or the user can be prompted to remove their mask.
Trueface::SDK::detectGlasses()to check if the person is wearing a eye glasses. If eyeglasses are detected, the image can be rejected, or the user can be prompted to remove their eyeglasses.
- Motion and blur
Blurry faces caused by motion or out-of-focus cameras make face recognition more challenging. Provide clear instructions for how the user should behave to facilitate the capture of high quality images.
Trueface::SDK::detectFaceImageBlur()to determine if an image contains blur or is of good quality for face recognition.
What is the difference between the static library and the dynamic library?
Functionality wise, there is no difference. Each provides it’s own benefits and disadvantages, which you can learn more about here. It should be noted that when using the static library, the following dependency libraries need to be manually linked:
Consult the sample code CMakeLists.txt file to guidance on how to do this.
What hardware does the GPU library support?
The x86-64 GPU enabled SDK supports NVIDIA GPUs with GPU Compute Capability 5.2+, and currently supports CUDA 11.8. The AArch64 GPU enabled SDK supports NVIDIA GPUs with GPU Compute Capability 5.3+, and currently supports CUDA 10.2 (default on NVIDIA Jetson devices).
You can determine your GPU Compute Capability here.
What is the TensorRT engine file and what is it used for?
As of V1.0 of the SDK, TensorRT is used for GPU inference for new models such as
The TensorRT engine file is an intermediate format of the model which is optimized for the target hardware based on the selected GPU configuration parameters.
If GPU inference is enabled, the SDK will search for the existence of this engine file in the directory specified by
If the engine file is not found, the SDK will generate one and save it to disk (once again, at the path specified by
If the engine file does already exist for the existing GPU options, the generation process is bypassed and the engine file is loaded from disk.
Please realize that the engine generation process can be slow and take a few minutes, particularly for the object detection models.
Note, changing the GPU configuration options such as
Trueface::GPUModuleOptions.precision will cause the engine file to be regenerated.
Additionally, the TensorRT engine file is locked to your GPU type. You can share generated engine files across devices only if they have the same GPU type.
Why is my license key not working with the GPU library?
The GPU library requires a different token which is generally tied to the GPU ID. Please speak with a sales representative to receive a GPU token.
Why does the first call to an inference function take much longer than the subsequent calls?
Our modules use lazy initialization meaning the machine learning models are only loaded into memory when the function is first called instead of on SDK initialization.
This ensure minimal memory overhead from unused modules.
If you know you will be using a module for inference, you can choose to initialize it in advance using the
Trueface::InitializeModule configuration option.
This option initializes specified modules in the SDK constructor instead of default lazy initialization.
What does a typical 1 to N face recognition pipeline involve?
Although there are various approaches you can take depending on your requirements and architecture (ex. largest face only vs all detected faces), the following pipelines should provide you with some guidance.
Offline steps - Enrollment:
The offline steps involve enrolling the
Trueface::Faceprint into our database of choosing.
We must therefore perform the following steps:
Perform the various checks described here
Online steps - Identification:
The online steps involve querying our database using probe
Trueface::Faceprint to determine if there is a potential match candidate.
We must therefore perform the following steps:
For either the largest face only or every detected face:
What architecture should I use when I have multiple camera streams producing lots of data?
For simple use cases, you can connect to a camera stream and do all of the processing directly onboard a single device. The following approaches are ideal for situations where you have lots of data to process. The first approach to consider is a publisher/subscriber architecture. Each camera producing data should push the image into a shared queue, then a worker or pool of workers can consume the data in the queue and perform the necessary operations on the data using the SDK. For a highly scalable design, split up all the components into microservices which can be distributed across many machines (with many GPUs) and use a Message Queue system such as RabbitMQ to facilitate communication and schedule tasks for each of the microservices. For maximum performance, be sure to use the GPU batch inference functions (see batch_fr_cuda sample app). Another approach is to process the camera stream data directly at the edge using embedded devices then either run identify at the edge too, or if dealing with massive collections, send the resulting Faceprints to a server / cluster of servers to run the identify calls. Refer to the 1 to N Identification tab (on the left) for more information on this approach. Finally, you can manage a cluster of PTOP instances using kubernetes or even run the instances on auto scaling cloud servers and post images directly to those for processing. These are just a few of the popular architectures you can follow, but there are many other correct approaches.
Why was setImage replaced by preprocessImage?
Trueface::SDK::setImage() function introduced state to the SDK which required the user to be mindful of thread safety when operating on a single SDK instance from multiple threads.
By removing state from the SDK, the CPU SDK is now completely threadsafe.
Additionally, the user of the SDK can now employ more versatile design patterns.
For example, the user can consume video from one thread, call the
Trueface::SDK::preprocessImage() method and enqueue the
Trueface::TFImage and then consume them from the queue for processing from a separate thread.
How do I use the python bindings for the SDK?
In order to run the sample apps, you must place the python bindings library in the same directory as the python script.
Alternatively, you can add the directory where the python bindings library resides to your
PYTHONPATH environment variable.
You may also need to add the directory where the supporting shared libraries (ex. ONNX Runtime) reside to your
LD_LIBRARY_PATH environment variable.
All you need to do from there is add
import tfsdk in your python script and you are ready to go.
How do createDatabaseConnection and createLoadCollection work?
Trueface::SDK::createDatabaseConnection() is used to establish a database connection.
This must always be called initially before loading Faceprints from a collection or enrolling Faceprints into a new collection,
Trueface::DatabaseManagementSystem::NONE options is being used, in which case it doesn’t need to be called and you can go straight to
Next, the user must call
Trueface::SDK::createLoadCollection() and provide a collection name.
If a collection with the specified name already exists in the database which we have connected to, then all the Faceprints in that collection will be loaded from the database collection into memory (RAM).
If no collection exists with the specified name, then a new collection is created. From here, the user can call
Trueface::SDK::enrollFaceprint() to save Faceprints to both the database collection and in-memory (RAM) collection.
If using the
Trueface::DatabaseManagementSystem::NONE, a new collection will always be created on a call to this function, and the previous loaded collection - if this is not the first time calling this function - will be deleted (since it is only stored in RAM).
From here, the 1 to N identification functions such as
Trueface::SDK::identifyTopCandidate() can be used to search through the collection for an identity.
What is the difference between loadCollection and loadCollectionPersist?
Trueface::SDK::loadCollection() method will load the specified collection from the database into memory.
In doing so, it will unload any previously loaded collections from memory.
For example, if collection A is loaded into memory, and then
Trueface::SDK::loadCollection() is called on collection B, collection A will be removed from memory and replaced by collection B.
The caller will then only be able to query collection B.
In contrast, the
Trueface::SDK::loadCollectionPersist() will load the specified collection from the database into memory while persisting any existing collections in memory.
In the above scenario, if collection A is initially loaded and
Trueface::SDK::loadCollectionPersist() is called on collection B, then collection A and B will subsequently both exist in memory and can both be queried.
How to load multiple collections in memory at once?
In order to load multiple collections in memory at once, use the
From here, the collection name must be specified for any subsequent 1 to N API call such as
What does the frVectorCompression flag do? When should I use it?
Trueface::ConfigurationOptions.frVectorCompression flag is used to enable optimizations in the SDK which compress the feature vector and improve the match speed for
both 1 to 1 comparisons and 1 to N identification.
It works by quantifying a 4 byte float to a 2 byte type. In doing so, nearly all the feature vector information is preserved while reducing the size of the feature vector by half. There is nearly no loss in accuracy from enabling this option.
What is the difference between the host and hostaddr PostgreSQL database connection parameters?
In order to connect to a PostgreSQL database, you must provide either the
host connection parameter.
hostaddr parameter is the numeric IP address where the database is hosted. Providing the
hostaddr parameter will avoid a host name lookup.
host parameter specifies the host name url of the database. Examples can include mydatabase.us-east-1.rds.amazonaws.com, localhost, etc.
host parameter is specified, a host name lookup will occur.