Face Recognition Technology
The system's reliability, high recognition rate, accuracy and speed of identification are ensured by specially developed algorithms. Face Capture can be used at airports, banks, casinos, public buildings, subways, factories, schools or any other location where it makes sense to record the faces of visitors, and it can be integrated into existing VMS applications. The Face Capture GUI is simple enough that any operator can use all of its functions with minimal training. The system is highly flexible, allowing images to be digitized and recorded in either color or monochrome, with a storage capacity typically exceeding 12 months of facial data recording. The Face Capture screen simultaneously shows the live camera shot and the latest sequence of captured images.

The following steps, all automated by the software and completed in milliseconds in the background, are critical to the successful execution of the SecurOS FACE module.


Stage I. Localization of the Face
To locate the face, a so-called image pyramid is formed from the original image. An image pyramid is a set of copies of the original image at different scales, thus representing a set of different resolutions. A mask is moved pixel-wise over each image in the pyramid, and at each position the image section under the mask is passed to a function that assesses the similarity of the image section to a face.
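The pyramid scan described above can be sketched roughly as follows. This is an illustrative sketch only, not the product's implementation: the halving scale step, the function names, and the mean_brightness stand-in for the face-similarity function are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def downscale_by_two(image: Image) -> Image:
    """Halve the resolution by averaging each 2x2 block of pixels."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image: Image, min_size: int) -> List[Image]:
    """Return copies of the image at successively halved resolutions."""
    pyramid = [image]
    while (len(pyramid[-1]) // 2 >= min_size
           and len(pyramid[-1][0]) // 2 >= min_size):
        pyramid.append(downscale_by_two(pyramid[-1]))
    return pyramid


def mean_brightness(patch: Image) -> float:
    """Placeholder score (assumption); a real system would use a face model."""
    return sum(sum(row) for row in patch) / (len(patch) * len(patch[0]))


def scan_pyramid(pyramid: List[Image], mask_size: int,
                 similarity: Callable[[Image], float],
                 threshold: float) -> List[Tuple[int, int, int, float]]:
    """Slide the mask pixel by pixel over every level and collect the hits.

    Each hit is (pyramid level, top row, left column, similarity score).
    """
    hits = []
    for level, img in enumerate(pyramid):
        for top in range(len(img) - mask_size + 1):
            for left in range(len(img[0]) - mask_size + 1):
                patch = [row[left:left + mask_size]
                         for row in img[top:top + mask_size]]
                score = similarity(patch)
                if score >= threshold:
                    hits.append((level, top, left, score))
    return hits
```

Because the mask has a fixed size, a hit at a coarse pyramid level corresponds to a large face in the original image, and a hit at a fine level to a small one.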

If the similarity value is high enough, the presence of a face at that position and resolution is assumed. From that position and resolution, the position and size of the face in the original image can be calculated. From the position of the face, a first estimate of the eye positions can be derived. In a neighborhood around these estimated positions, a search for the exact eye positions is started. This search is very similar to the search for the face position, the main difference being that the resolution of the images in the pyramid is higher than the resolution at which the face was found before. The positions yielding the highest similarity values are taken as final estimates of the eye positions.
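As a rough illustration of how a hit in the pyramid maps back to the original image, and how a first guess at the eye positions might be derived from it, consider the sketch below. The scale step of 2 per level and the eye-position fractions are hypothetical values chosen for the example; the source does not state them.

```python
from typing import Tuple

Point = Tuple[float, float]  # (row, column) in original-image pixels


def face_box_in_original(level: int, top: int, left: int, mask_size: int,
                         scale_step: float = 2.0) -> Tuple[float, float, float]:
    """Map a hit at a pyramid level back to (top, left, size) in the original.

    Assumes each pyramid level shrinks the image by `scale_step`.
    """
    factor = scale_step ** level
    return top * factor, left * factor, mask_size * factor


def initial_eye_estimates(top: float, left: float, size: float,
                          eye_row_frac: float = 0.35,
                          left_col_frac: float = 0.3,
                          right_col_frac: float = 0.7) -> Tuple[Point, Point]:
    """First estimate of both eye centers from the face box.

    The fractions are illustrative assumptions about where eyes sit in a
    frontal face box; they are not taken from the source.
    """
    row = top + eye_row_frac * size
    return (row, left + left_col_frac * size), (row, left + right_col_frac * size)
```

These estimates would only seed the finer search: the exact eye positions are then looked for in a small neighborhood around them, at a higher resolution than the one at which the face was found.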
Stage II. Image Quality Check
To be usable for the subsequent steps, the part of the image occupied by the face has to meet certain quality requirements; e.g., it should not be too noisy or blurred. The quality is measured by means of a set of functions that are applied to the image. If the quality is considered too low, the image is rejected.
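The source does not name the quality functions it uses. As one possible example, the sketch below uses the variance of a Laplacian filter response, a common sharpness heuristic, and accepts an image only if every measure falls within a configured range. The function names and thresholds are assumptions.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]
Check = Tuple[Callable[[Image], float], float, float]  # (measure, low, high)


def laplacian_variance(image: Image) -> float:
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    Low values suggest a blurred image; very high values can indicate noise.
    """
    responses = []
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            responses.append(
                image[r - 1][c] + image[r + 1][c]
                + image[r][c - 1] + image[r][c + 1]
                - 4.0 * image[r][c]
            )
    mean = sum(responses) / len(responses)
    return sum((v - mean) ** 2 for v in responses) / len(responses)


def passes_quality_check(image: Image, checks: List[Check]) -> bool:
    """Accept the image only if every measure lies inside its [low, high] range."""
    return all(low <= measure(image) <= high for measure, low, high in checks)
```

A call such as passes_quality_check(face, [(laplacian_variance, 1.0, 100.0)]) would reject a completely flat (blurred) patch; the bounds shown are placeholders that would need tuning on real data.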
Stage III. Normalization and Pre-Processing
In the normalization step, the face is extracted, rotated and scaled such that the centers of the eyes lie at predefined positions. More precisely, they are positioned to lie on the same horizontal pixel row such that the midpoint of this row is aligned with the midpoint between the centers of the eyes.
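The rotation, scaling and shift that bring two detected eye centers onto predefined positions form a similarity transform, which is fully determined by the two point pairs. A minimal sketch, using complex numbers to represent (x, y) points, might look like this; the function name and the target coordinates in the usage note are hypothetical.

```python
from typing import Callable, Tuple

Point = Tuple[float, float]  # (x, y) in pixels


def eye_alignment_transform(left_eye: Point, right_eye: Point,
                            target_left: Point,
                            target_right: Point) -> Callable[[Point], Point]:
    """Build the similarity transform that maps the eyes onto the targets.

    With points written as complex numbers z = x + iy, the transform is
    z' = a * z + b, where abs(a) is the scale factor and the argument of a
    is the rotation angle.
    """
    e1, e2 = complex(*left_eye), complex(*right_eye)
    t1, t2 = complex(*target_left), complex(*target_right)
    a = (t2 - t1) / (e2 - e1)  # rotation and scale
    b = t1 - a * e1            # translation

    def apply(point: Point) -> Point:
        z = a * complex(*point) + b
        return z.real, z.imag

    return apply
```

For example, eye_alignment_transform((10, 10), (30, 30), (20, 40), (60, 40)) yields a mapping that places both eyes on row 40. The sketch only maps coordinates; producing the normalized image would additionally require resampling the pixels.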
The pre-processing step comprises, among other transformations, the elimination of very high and very low spatial frequencies and the normalization of contrast.
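The exact filters are not specified in the source. One simple way to illustrate the idea is a difference of two box blurs as a crude band-pass, followed by scaling to zero mean and unit standard deviation. The radii and function names below are assumptions for the example.

```python
from typing import List

Image = List[List[float]]


def box_blur(image: Image, radius: int) -> Image:
    """Average over a (2*radius+1)-wide square window, truncated at the borders."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [image[rr][cc]
                      for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                      for cc in range(max(0, c - radius), min(cols, c + radius + 1))]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def band_pass(image: Image, small_radius: int = 1, large_radius: int = 4) -> Image:
    """Crude band-pass: a light blur removes the finest detail, and
    subtracting a heavy blur removes slow brightness changes."""
    fine = box_blur(image, small_radius)
    coarse = box_blur(image, large_radius)
    return [[f - g for f, g in zip(fine_row, coarse_row)]
            for fine_row, coarse_row in zip(fine, coarse)]


def normalize_contrast(image: Image, eps: float = 1e-8) -> Image:
    """Shift to zero mean and scale to (approximately) unit standard deviation."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    std = (sum((p - mean) ** 2 for p in pixels) / len(pixels)) ** 0.5
    return [[(p - mean) / (std + eps) for p in row] for row in image]
```

Applied in sequence, normalize_contrast(band_pass(face)) makes two images of the same face more comparable under different lighting, since uniform brightness offsets and overall contrast differences are removed.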
Stage IV. Feature Extraction
Feature extraction starts with local image transforms that are applied at fixed image locations. These transforms capture local information relevant for distinguishing people, e.g. the amplitudes at certain spatial frequencies in a local area. The results are collected in a vector.
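To make "amplitudes at certain spatial frequencies in a local area" concrete, the sketch below computes the magnitude of individual 2D discrete Fourier coefficients of small patches at fixed locations and collects them in a vector. The choice of a plain Fourier coefficient, as well as the locations and frequencies, is an assumption; the source does not say which local transform is used.

```python
import cmath
from typing import List, Tuple

Image = List[List[float]]


def local_amplitude(image: Image, top: int, left: int, size: int,
                    u: int, v: int) -> float:
    """Amplitude of the 2D DFT coefficient (u, v) of a size x size patch.

    u and v count cycles per patch in the vertical and horizontal direction.
    """
    total = 0j
    for r in range(size):
        for c in range(size):
            phase = -2j * cmath.pi * (u * r + v * c) / size
            total += image[top + r][left + c] * cmath.exp(phase)
    return abs(total) / (size * size)


def local_feature_vector(image: Image, locations: List[Tuple[int, int]],
                         size: int,
                         frequencies: List[Tuple[int, int]]) -> List[float]:
    """Collect the amplitudes for every fixed location and frequency."""
    return [local_amplitude(image, top, left, size, u, v)
            for top, left in locations
            for u, v in frequencies]
```

Because normalization has already placed the eyes at predefined positions, fixed patch locations refer to roughly the same facial regions in every image, which is what makes the resulting vectors comparable.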

A global transform is then applied to this vector. Using a large face-image database, the parameters of this transform were chosen to maximize the ratio of the inter-person variance to the intra-person variance in the space of the transformed vectors; i.e., the distances between vectors corresponding to images of different persons should be large compared to distances between vectors corresponding to images of the same person. The result of this transformation is another vector that represents the feature set of the processed face image.
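The criterion described here resembles the Fisher criterion used in linear discriminant analysis, although the source does not name the method. As a simplified illustration, the sketch below evaluates the variance ratio for a single projection direction w; a training procedure would search for the directions that make this ratio as large as possible. All names are hypothetical.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def project(x: Vector, w: Vector) -> float:
    """Scalar projection of x onto the direction w (dot product)."""
    return sum(xi * wi for xi, wi in zip(x, w))


def variance_ratio(groups: Dict[str, List[Vector]], w: Vector) -> float:
    """Ratio of inter-person to intra-person variance along direction w.

    `groups` maps each person to the vectors obtained from that person's images.
    """
    projected = {person: [project(x, w) for x in vectors]
                 for person, vectors in groups.items()}
    all_values = [v for values in projected.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)

    inter = 0.0  # spread of the per-person means around the overall mean
    intra = 0.0  # spread of each person's values around that person's mean
    for values in projected.values():
        person_mean = sum(values) / len(values)
        inter += len(values) * (person_mean - grand_mean) ** 2
        intra += sum((v - person_mean) ** 2 for v in values)

    return inter / intra if intra > 0 else float("inf")
```

A direction along which different people differ strongly while images of the same person barely vary yields a high ratio; a direction that mostly reflects lighting or expression changes yields a ratio near zero.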

Stage V. Creation and Comparison of Reference Set
For the creation of the reference set, several images are usually taken of each person during enrollment in order to better cover the range of possible appearances of that person’s face. The reference set generated for a person consists of up to five feature sets, which are the centers of clusters obtained through a clustering process on the feature sets created from those images.
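The source does not say which clustering method is used. Assuming a basic k-means procedure for illustration, a reference set of at most five cluster centers could be built from the enrollment feature sets as sketched below; the function names and the simple initialization are assumptions.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_reference_set(feature_sets: List[Vector], max_centers: int = 5,
                        iterations: int = 20) -> List[List[float]]:
    """Cluster the feature sets with k-means and return the cluster centers.

    Uses the first k feature sets as initial centers (a simplification).
    """
    k = min(max_centers, len(feature_sets))
    centers = [list(f) for f in feature_sets[:k]]
    for _ in range(iterations):
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        for f in feature_sets:
            nearest = min(range(k), key=lambda i: squared_distance(f, centers[i]))
            clusters[nearest].append(f)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers
```

Storing a handful of centers rather than every enrollment image keeps each reference set small while still covering distinct appearances such as different poses or expressions.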
The function used to compare a feature set with a reference set is simple and can be computed very quickly. This makes identification a matter of seconds, even if a million reference sets have to be compared.
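The comparison function itself is not given in the source. One simple candidate, shown here purely as an assumption, is the smallest Euclidean distance between the probe feature set and any center in the reference set, with identification performed as a linear scan over all enrolled persons.

```python
import math
from typing import Dict, List, Optional, Sequence

Vector = Sequence[float]


def distance_to_reference_set(feature: Vector,
                              reference_set: List[Vector]) -> float:
    """Smallest Euclidean distance from the feature set to any cluster center."""
    return min(math.dist(feature, center) for center in reference_set)


def identify(feature: Vector, gallery: Dict[str, List[Vector]],
             max_distance: float) -> Optional[str]:
    """Return the best-matching person, or None if no match is close enough.

    `gallery` maps each enrolled person to their reference set.
    """
    best_person, best_distance = None, float("inf")
    for person, reference_set in gallery.items():
        d = distance_to_reference_set(feature, reference_set)
        if d < best_distance:
            best_person, best_distance = person, d
    return best_person if best_distance <= max_distance else None
```

With at most five centers per person, each comparison costs only a few vector distance computations, which is consistent with scanning a very large gallery in a short time.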
