Facial Recognition
Facial recognition is a biometric method of identifying a person from an image of their face. Biometric methods use biological traits to identify people. Humans naturally recognize people by looking at them, but they recognize familiar people much more easily than complete strangers. Moreover, human concentration is limited, so a person is poorly suited to long surveillance tasks or to comparing hundreds of images in search of a match to a photograph. Computerized methods have therefore been developed to perform facial recognition. Identification of faces is important in security, surveillance, and forensics.
Biometric methods have been under development since the late 1980s. In the 1990s the first commercial systems appeared on the market. The first large trial of the technology took place in January 2001, at Super Bowl XXXV in Tampa, Florida. Spectators were photographed, without their knowledge, as they entered the stadium. The images were then compared to a police database.
Currently, the technology is used by police, forensic scientists, governments, private companies, the military, and casinos. The police use facial recognition to identify criminals. Companies use it to secure access to restricted areas. Casinos use it to exclude cheaters and dishonest money counters. In the United States, nearly half of the states use computerized identity verification, and the National Center for Missing and Exploited Children uses the technique to find missing children on the Internet. In Mexico, a voter database was compiled to prevent voter fraud. Facial recognition technology can also be used in places such as airports, government buildings, and ATMs (automated teller machines), and to secure computers and mobile phones.
Computerized facial recognition is based on capturing an image of a face, extracting its features, comparing them to those of images in a database, and identifying matches. Because a computer cannot see the way a human can, it must convert images into numbers representing the various features of a face. The set of numbers representing one face is then compared with the set representing another face.
The quality of a computer recognition system depends on the quality of the image and on the mathematical algorithms used to convert a picture into numbers. Important factors for image quality are light, background, and position of the head. Pictures can be taken of still or moving subjects. Still subjects are photographed, for example, by the police (mug shots) or by specially placed security cameras (access control). The most challenging application, however, is the use of images captured by surveillance cameras (in shopping malls, train stations, and ATMs) or closed-circuit television (CCTV). In many cases the subject in those images is moving fast, and the light and the position of the head are not optimal.
The techniques used for facial recognition can be feature-based (geometric) or template-based (photometric). The geometric method relies on the shape and position of the facial features. It analyzes each of the facial features, also known as nodal points, independently; it then generates a full picture of a face. The most commonly used nodal points are the distance between the eyes, the width of the nose, the cheekbones, the jaw line, the chin, and the depth of the eye sockets. Although there are about 80 nodal points on the face, most software measures only around a quarter of them. The points chosen by the software must be able to uniquely differentiate between people. In contrast, the image-based, or photometric, methods create a template of the features and use that template to identify faces.
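The geometric approach described above can be illustrated with a minimal sketch. It assumes the nodal points have already been located; the landmark names, the coordinates, and the choice of three ratios are hypothetical and are not taken from any particular product.

```python
import math

# Hypothetical landmark coordinates (x, y) in pixels. Real systems locate
# these points automatically; that step is not shown here.
FACE = {
    "left_eye": (120.0, 150.0),
    "right_eye": (200.0, 150.0),
    "nose_left": (145.0, 210.0),
    "nose_right": (175.0, 210.0),
    "jaw_left": (100.0, 260.0),
    "jaw_right": (220.0, 260.0),
    "chin": (160.0, 310.0),
}


def geometric_features(landmarks):
    """Turn a few nodal points into ratios that do not depend on image size."""
    eye_distance = math.dist(landmarks["left_eye"], landmarks["right_eye"])
    nose_width = math.dist(landmarks["nose_left"], landmarks["nose_right"])
    jaw_width = math.dist(landmarks["jaw_left"], landmarks["jaw_right"])
    eye_midpoint = (
        (landmarks["left_eye"][0] + landmarks["right_eye"][0]) / 2,
        (landmarks["left_eye"][1] + landmarks["right_eye"][1]) / 2,
    )
    face_length = math.dist(eye_midpoint, landmarks["chin"])
    # Dividing by the eye distance removes the effect of image scale.
    return [
        nose_width / eye_distance,
        jaw_width / eye_distance,
        face_length / eye_distance,
    ]


features = geometric_features(FACE)
```

Using ratios rather than raw pixel distances is one simple design choice: the same face photographed at twice the resolution yields the same numbers.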
The algorithms used by the software tools are proprietary and secret. The most common method is eigenfaces, which uses principal component analysis (PCA) to extract face features. The analysis can be very accurate, as many features can be extracted and all of the image data is analyzed together; no information is discarded. Another common method of creating templates uses neural networks. Despite continuous improvements, none of the current algorithms is 100% correct. The best verification rates are about 90%, while the majority of systems claim false accept rates of 1%. The most common causes of failure are the sensitivity of some methods to lighting, facial expressions, hairstyles, hair color, and facial hair.
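The two figures quoted above, verification rate and false accept rate, are both computed from comparison scores at a chosen decision threshold. The following is a minimal sketch of that arithmetic; the function name, the scores, and the threshold are made up for illustration and do not describe any real system.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (verification rate, false accept rate) at a given threshold.

    genuine_scores: similarity scores for pairs of images of the same person.
    impostor_scores: similarity scores for pairs of images of different people.
    A pair is accepted as a match when its score reaches the threshold.
    """
    accepted_genuine = sum(score >= threshold for score in genuine_scores)
    accepted_impostor = sum(score >= threshold for score in impostor_scores)
    verification_rate = accepted_genuine / len(genuine_scores)
    false_accept_rate = accepted_impostor / len(impostor_scores)
    return verification_rate, false_accept_rate


# Illustrative scores only: 10 genuine comparisons and 100 impostor comparisons.
genuine = [0.95, 0.91, 0.88, 0.86, 0.84, 0.80, 0.78, 0.75, 0.72, 0.55]
impostor = [0.74] + [0.30] * 99

rates = error_rates(genuine, impostor, threshold=0.70)
```

Raising the threshold lowers the false accept rate but also rejects more genuine matches, which is why the two rates are normally reported together.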
Despite the differences in the mathematical methods used, face recognition analysis follows the same set of steps. The first step is image acquisition; once the image is captured, a head is identified. In some cases it is necessary to normalize the image before feature extraction. This is accomplished by scaling and rotating the image so that the size and position of the face are optimal for the next step. Once the image has been presented to the computer, the software begins feature extraction using one of the algorithms. Feature extraction includes localization of the face, detection of the facial features, and the actual extraction. Eyes, nose, and mouth are the first features identified by most of the techniques; other features are identified later. The extracted features are then used to generate a numerical map of each face analyzed.
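The normalization step, scaling and rotating so that the face has a standard size and orientation, can be sketched using the two eye positions as reference points. This is only one possible approach; the function names and the target eye distance of 60 pixels are assumptions made for the example.

```python
import math


def alignment_transform(left_eye, right_eye, target_eye_distance=60.0):
    """Return the (angle, scale) needed to level and resize a face.

    angle is the tilt of the line joining the eyes, in radians.
    scale resizes the face so that the eyes end up target_eye_distance apart.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = math.atan2(dy, dx)
    scale = target_eye_distance / math.hypot(dx, dy)
    return angle, scale


def normalize_point(point, left_eye, angle, scale):
    """Map an image point into the normalized frame.

    The left eye becomes the origin, the eye line becomes horizontal,
    and all distances are multiplied by scale.
    """
    x = point[0] - left_eye[0]
    y = point[1] - left_eye[1]
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)  # rotate by -angle
    return (scale * (x * cos_a - y * sin_a), scale * (x * sin_a + y * cos_a))


# Hypothetical eye positions in a tilted, slightly small face image.
left_eye, right_eye = (100.0, 100.0), (130.0, 140.0)
angle, scale = alignment_transform(left_eye, right_eye)
aligned_right_eye = normalize_point(right_eye, left_eye, angle, scale)
```

In a full system the same transform would be applied to every pixel of the image, not just to individual points, before feature extraction begins.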
The generated templates are then compared to images stored in the database. The database used may consist of mug shots, composites of suspects, or video surveillance images. This process creates a list of hits with scores, which is very similar to search results on the Internet. It is often up to the user to determine if the similarity produced is adequate to warrant declaration of a match. Even if the user does not have to make a decision, he or she is most likely determining the settings used later by the computer to declare a match.
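The ranked list of hits described above can be sketched as follows. The templates, the subject names, the scoring formula (a similarity derived from Euclidean distance), and the threshold are all illustrative assumptions; commercial systems use their own scoring.

```python
import math


def rank_candidates(probe, gallery, threshold=0.9):
    """Score a probe template against every stored template.

    Returns a list of (name, score, is_match) tuples, best score first.
    score is 1.0 for identical templates and falls toward 0 as they differ.
    """
    hits = []
    for name, template in gallery.items():
        score = 1.0 / (1.0 + math.dist(probe, template))
        hits.append((name, score, score >= threshold))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


# Hypothetical database of stored templates (three numbers per face).
gallery = {
    "subject_a": (0.38, 1.50, 2.00),
    "subject_b": (0.45, 1.35, 1.90),
    "subject_c": (0.30, 1.70, 2.20),
}
probe = (0.375, 1.5, 2.0)

for name, score, is_match in rank_candidates(probe, gallery):
    print(f"{name}: score={score:.3f} match={is_match}")
```

The threshold is the setting the operator controls: the software always produces a ranked list, and the threshold decides which entries are declared matches.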
Depending on the software used, comparisons can be one-to-one or one-to-many. A one-to-one comparison confirms someone's claimed identity (verification); a one-to-many comparison identifies an unknown person (identification). Another application of facial recognition takes advantage of video-based surveillance. Recordings can be used to identify people in retrospect, after their images were captured. Live video can also be used to identify a particular person during surveillance, while they are moving around. This can be useful for catching criminals in the act, spotting cheaters in casinos, or identifying terrorists.
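The difference between one-to-one and one-to-many comparison can be shown in a few lines. This is a sketch under assumptions: the templates are short tuples of numbers, similarity is measured by Euclidean distance, and the names `verify`, `identify`, and `max_distance` are hypothetical.

```python
import math


def verify(probe, claimed_template, max_distance=0.1):
    """One-to-one: does the probe match the template of the claimed identity?"""
    return math.dist(probe, claimed_template) <= max_distance


def identify(probe, gallery, max_distance=0.1):
    """One-to-many: return the closest enrolled name, or None if nobody is close enough."""
    best_name, best_distance = None, float("inf")
    for name, template in gallery.items():
        distance = math.dist(probe, template)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= max_distance else None


# Hypothetical enrolled templates and a probe taken from a new image.
enrolled = {
    "subject_a": (0.38, 1.50, 2.00),
    "subject_b": (0.45, 1.35, 1.90),
}
new_probe = (0.375, 1.5, 2.0)

claim_accepted = verify(new_probe, enrolled["subject_a"])
who = identify(new_probe, enrolled)
```

Verification needs a single comparison, while identification must search the whole database, so its cost and its chance of a false match both grow with the number of enrolled faces.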
Most of the earliest and current methods of face recognition are 2-dimensional (2-D); they use a flat image of a face. However, 3-D methods are also being developed, and some are already available commercially. The main difference in 3-D analysis is the use of the shape of the face, which adds information to the final template. The first step in a 3-D analysis is the generation of a virtual mesh reflecting a person's facial shape. This can be done by scanning the person's face with near-infrared light and repeating the process a few times. The nodal points are located on the mesh, generating thousands of reference points rather than the 20–30 used by 2-D methods. This makes the 3-D methods more accurate, but also more invasive and more expensive. As a result, 2-D methods remain the most commonly used.
An extension of facial recognition and 3-D methods is the use of computer graphics to reconstruct faces from skulls. This allows identification of people from skulls if all other methods of identification fail. In the past, facial reconstruction was done manually by a forensic artist: clay was applied to the skull, following its contours, until a face was formed. Today the reconstruction can be computerized. Software creates a head template from landmarks on the skull and overlays it with computer-generated muscles. Once the face is generated, its image can be compared to various databases for identification in the same way as a live person's image.
The use of facial recognition is important in law enforcement, as facial verification performed by a forensic scientist can help to convict criminals. For example, in 2003, a group of men was convicted of credit card fraud in the United Kingdom on the basis of facial verification. Their images were captured on a surveillance tape near an ATM, and their identities were confirmed later by a forensic specialist using facial recognition tools.
Despite recent advances in the area, facial recognition in a surveillance system is often technically difficult. The main reason is that the system has difficulty finding the face when people are moving, wearing hats or sunglasses, or not facing the camera. Even if the face is found, identification might be difficult because lighting that is too bright or too dark makes features hard to recognize. Image resolution and camera angle are also important variables. Normalization performed by the computer might not be effective if the incoming image is of poor quality.
One of the ways to improve image quality is to use fixed cameras, especially in places like airports, government buildings, or sporting venues. In such cases all the people coming through are captured by the camera in a similar pose, making it easier for the computer to generate a template and compare to a database.
While most people do not object to the use of this technology to identify criminals, there are fears that images of people can be taken at any time, anywhere, without their permission. It is clear that the ability to identify people with 100% certainty using face recognition is still some time away. Nevertheless, facial recognition is an increasingly important identity verification method.
see also Biometrics; Composite drawing.