1 to N Identification
==============================

Overview
*********
The 1 to N functions let you enroll face recognition templates (Faceprints) into a database of face templates called a collection, then efficiently search that collection for an identity.
Because the 1 to N functions are fairly involved, we highly recommend reading all of the information below as well as the FAQs.

Using Multiple SDK Instances in a Single Process
************************************************
In order to avoid loading the same collection into memory multiple times (which becomes an issue when the collection sizes become very large),
instances of the SDK created within the same process will share the same collection(s) in memory (RAM). This means when you enroll a Faceprint into the collection
using one instance of the SDK, it will be available in all other instances of the SDK in the same process.
For this same reason, applications which have multiple instances of the SDK in a single process only need to call :meth:`Trueface::SDK::createDatabaseConnection` and :meth:`Trueface::SDK::loadCollection`
on a single instance of the SDK and all other instances will automatically be connected to the same database and collection.
For more information on :meth:`Trueface::SDK::createDatabaseConnection` and :meth:`Trueface::SDK::createLoadCollection`, refer to "How do createDatabaseConnection and createLoadCollection work?" on the `FAQ page <faq.html#how-do-createdatabaseconnection-and-createloadcollection-work>`_.

Database Management Systems and Collection Synchronization
**********************************************************
The PostgreSQL backend option also has built-in synchronization across multiple processes.
Consider an example with two processes, A and B, running on different machines and connected to the same PostgreSQL backend.
Each of these processes will initially connect to the same database and collection and therefore load all the Faceprints from the database into memory (RAM).
If process A then enrolls a Faceprint into the collection, the Faceprint is added to the in-memory (RAM) collection of process A and written to the PostgreSQL database.
The enrollment also automatically pushes a notification to all subscribed processes connected to the same database and collection.
Any process connected to the same database and collection is automatically subscribed to updates; no additional action is required from the developer.
Process B receives the notification and automatically enrolls the same Faceprint into its own in-memory (RAM) collection,
so processes A and B end up with synchronized collections in memory. By default, each process checks for notifications every 30 seconds; this period can be changed with the ``DB_NOTIFICATION_CHECK_FREQUENCY`` environment variable.
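
As an illustration of adjusting the polling period from inside an application, the sketch below sets the environment variable for the current process. It is not part of the SDK: the helper name ``setNotificationCheckFrequency`` is hypothetical, and it assumes that the value is expressed in seconds and that the variable is read after it has been set (for example, before the database connection is created). Verify both assumptions against your SDK version.

.. code-block:: cpp

    #include <stdlib.h>
    #include <cstdlib>
    #include <string>

    // Hypothetical helper, not an SDK function.
    // Assumption: the SDK interprets the value as a number of seconds.
    // Returns true if the environment variable was set successfully.
    bool setNotificationCheckFrequency(unsigned int seconds) {
        const std::string value = std::to_string(seconds);
        // setenv is POSIX; on Windows, _putenv_s plays the same role.
        return ::setenv("DB_NOTIFICATION_CHECK_FREQUENCY", value.c_str(), 1) == 0;
    }

Setting the variable in the shell or service definition that launches the process achieves the same result without any code changes.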

This sort of multi-process synchronization is not supported by the SQLite backend. With the SQLite backend, if process A makes a change to the database,
process B will not know about it. Process B must call :meth:`Trueface::SDK::loadCollection` or :meth:`Trueface::SDK::loadCollections` again to pick up the changes that process A made to the database.
Note that doing so does not perform an incremental update; it discards and then reloads all the data into memory, which can be slow if the collection is large.
For this reason, we advise using the SQLite backend only when a single process connects to the database.
If multiple processes need to connect to a database (and require synchronization), use the PostgreSQL backend.

Collection Memory Requirements
******************************
The following table lets you estimate roughly how much RAM is required to store Faceprint templates (this does not include the memory required for extracting Faceprints).
The estimates are very conservative, but the exact size ultimately depends on the length of the ``identity`` strings you choose.

.. list-table:: Faceprint Storage RAM Requirements
   :header-rows: 1

   * - Number of Faceprints
     - TFV5_2/TFV6/TFV7
     - TFV5_2*/TFV6*/TFV7*
     - LITE
     - LITE*
   * - 1
     - 2250 Bytes
     - 1250 Bytes
     - 762 Bytes
     - 506 Bytes
   * - 1,000
     - 2.15 MiB
     - 1.19 MiB
     - 0.73 MiB
     - 0.48 MiB
   * - 1,000,000
     - 2.10 GiB
     - 1.16 GiB
     - 0.71 GiB
     - 0.47 GiB
   * - 10,000,000
     - 20.96 GiB
     - 11.64 GiB
     - 7.10 GiB
     - 4.71 GiB
   * - 100,000,000
     - 209.55 GiB
     - 116.42 GiB
     - 70.97 GiB
     - 47.12 GiB

\* With the :class:`Trueface::ConfigurationOptions.frVectorCompression` flag enabled.
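
The arithmetic behind the table can be sketched as follows. This is a minimal illustration rather than an SDK function: the name ``estimateCollectionRamGiB`` and the constants are hypothetical, with the per-Faceprint byte counts taken from the first row of the table. The result is expressed in binary gigabytes (2^30 bytes), and the true footprint still depends on the length of your ``identity`` strings.

.. code-block:: cpp

    #include <cstdint>

    // Approximate bytes per stored Faceprint, copied from the first table row.
    constexpr double kBytesPerFaceprintFull = 2250.0;            // TFV5_2/TFV6/TFV7
    constexpr double kBytesPerFaceprintFullCompressed = 1250.0;  // with frVectorCompression
    constexpr double kBytesPerFaceprintLite = 762.0;             // LITE
    constexpr double kBytesPerFaceprintLiteCompressed = 506.0;   // LITE with frVectorCompression

    // Hypothetical helper: estimated collection size in units of 2^30 bytes.
    constexpr double estimateCollectionRamGiB(std::uint64_t numFaceprints,
                                              double bytesPerFaceprint) {
        return static_cast<double>(numFaceprints) * bytesPerFaceprint /
               (1024.0 * 1024.0 * 1024.0);
    }

For example, ``estimateCollectionRamGiB(1000000, kBytesPerFaceprintFull)`` evaluates to roughly 2.10, in line with the 1,000,000 row of the table.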


For most use cases, even embedded devices have enough RAM to search collections of medium to large size (e.g., a Raspberry Pi 4 can handle a few million Faceprints).
However, when running 1 to N identification on massive collections (tens or hundreds of millions of Faceprints) on a lightweight embedded device, you may find that the device does
not have sufficient RAM to hold the entire collection in memory. In these situations, run the actual 1 to N search on a server that has sufficient RAM.
Process the video streams on the embedded devices at the edge to generate feature vectors for the detected faces, then send these feature vectors to the server (or cluster of servers),
which runs the actual 1 to N identification functions (e.g., :meth:`Trueface::SDK::identifyTopCandidate`). The server should also handle enrolling and deleting Faceprints from the collection as required
(these functions can also be exposed to the edge devices as REST API endpoints). The edge devices therefore only generate feature vectors, while only the servers connect to the database and perform the searches.
To simplify things (and avoid having to write your own REST API server), you can have your edge devices send the feature vectors to an instance of the `PTOP <https://reference.trueface.ai/onprem/latest/index.html>`_ running on your server
to perform the matching.
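
If you do write your own transport between the edge devices and the server, the feature vector has to be turned into bytes on the edge and restored on the server. The sketch below shows one possible way to do this. Everything in it is an assumption made for illustration: the function names and the wire format (a 4-byte element count followed by the raw ``float`` values) are hypothetical and not part of the SDK, and both ends are assumed to share the same endianness and ``float`` layout. If your SDK version provides its own Faceprint serialization, prefer that instead.

.. code-block:: cpp

    #include <cstddef>
    #include <cstdint>
    #include <cstring>
    #include <stdexcept>
    #include <vector>

    // Hypothetical wire format: [uint32 count][count * float].
    std::vector<std::uint8_t> packFeatureVector(const std::vector<float>& features) {
        const std::uint32_t count = static_cast<std::uint32_t>(features.size());
        std::vector<std::uint8_t> buffer(sizeof(count) + count * sizeof(float));
        std::memcpy(buffer.data(), &count, sizeof(count));
        if (count > 0) {
            std::memcpy(buffer.data() + sizeof(count), features.data(),
                        count * sizeof(float));
        }
        return buffer;
    }

    // Inverse of packFeatureVector; throws if the buffer size is inconsistent.
    std::vector<float> unpackFeatureVector(const std::vector<std::uint8_t>& buffer) {
        std::uint32_t count = 0;
        if (buffer.size() < sizeof(count)) {
            throw std::invalid_argument("buffer too small for header");
        }
        std::memcpy(&count, buffer.data(), sizeof(count));
        const std::size_t expected =
            sizeof(count) + static_cast<std::size_t>(count) * sizeof(float);
        if (buffer.size() != expected) {
            throw std::invalid_argument("buffer size does not match header");
        }
        std::vector<float> features(count);
        if (count > 0) {
            std::memcpy(features.data(), buffer.data() + sizeof(count),
                        count * sizeof(float));
        }
        return features;
    }

The packed buffer can then be sent as the body of a request to whatever endpoint your server exposes; the size check on the receiving side rejects truncated or corrupted payloads before they reach the search.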

Selecting the Best Enrollment Image
***********************************
It is imperative that enrollment images are of high quality. Enrolling low quality images into a collection can result in false positives (incorrect identifications).
For guidance on how to select the best image for enrollment, please visit our `FAQ Page <./faq.html#how-to-select-the-best-image-for-enrollment>`_.

We also advise saving your enrollment images in a database of your own choosing (and mapping them to the UUID returned by :meth:`Trueface::SDK::enrollFaceprint`).
That way, if we release an improved face recognition model in the future, you will be able to regenerate Faceprints for all your enrollment images using the new model.
The Trueface SDK does **not** store your images; it was deliberately designed to remain lean.


.. doxygenfunction:: Trueface::SDK::createDatabaseConnection

.. doxygenfunction:: Trueface::SDK::createLoadCollection

.. doxygenfunction:: Trueface::SDK::createCollection

.. doxygenfunction:: Trueface::SDK::loadCollection

.. doxygenfunction:: Trueface::SDK::loadCollectionPersist

.. doxygenfunction:: Trueface::SDK::loadCollections

.. doxygenfunction:: Trueface::SDK::deleteCollection

.. doxygenfunction:: Trueface::SDK::getCollectionNames

.. doxygenfunction:: Trueface::SDK::getLoadedCollectionNames

.. doxygenfunction:: Trueface::SDK::getCollectionMetadata

.. doxygenfunction:: Trueface::SDK::getCollectionIdentities

.. doxygenfunction:: Trueface::SDK::enrollFaceprint

.. doxygenfunction:: Trueface::SDK::removeByUUID

.. doxygenfunction:: Trueface::SDK::removeByIdentity

.. doxygenfunction:: Trueface::SDK::identifyTopCandidate

.. doxygenfunction:: Trueface::SDK::batchIdentifyTopCandidate

.. doxygenfunction:: Trueface::SDK::identifyTopCandidates

.. doxygenfunction:: Trueface::SDK::registerDatabaseDisconnectionCallback

.. doxygenstruct:: Trueface::Candidate
   :members:

.. doxygenstruct:: Trueface::CollectionMetadata
   :members: