{"id":4778,"date":"2021-06-26T06:33:54","date_gmt":"2021-06-26T06:33:54","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2021\/06\/26\/face-detection-explained-state-of-the-art-methods-and-best-tools\/"},"modified":"2021-06-26T06:33:54","modified_gmt":"2021-06-26T06:33:54","slug":"face-detection-explained-state-of-the-art-methods-and-best-tools","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2021\/06\/26\/face-detection-explained-state-of-the-art-methods-and-best-tools\/","title":{"rendered":"Face Detection Explained: State-of-the-Art Methods and Best Tools"},"content":{"rendered":"<p>Author: Oleksandr Tyron<\/p>\n<div>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9136999481?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9136999481?profile=RESIZE_710x\" class=\"align-center\"><\/a><\/p>\n<\/p>\n<p><span style=\"font-weight: 400;\">Many of us have used Facebook applications to see ourselves aged, turned into rock stars, or wearing festive make-up. Such waves of facial transformations are usually accompanied by warnings not to share images of your face \u2013 otherwise, they will be processed and misused.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">But how does AI use faces in reality? 
Let\u2019s discuss state-of-the-art applications for face detection and recognition.<\/p>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9137038095?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9137038095?profile=RESIZE_710x\" class=\"align-full\"><\/a><br \/><\/span><\/p>\n<p><span style=\"font-weight: 400;\">First, detection and recognition are different tasks.<\/span> <b><i>Face detection<\/i><\/b> <span style=\"font-weight: 400;\">is a crucial part of face recognition: it determines the number of faces in a picture or video without remembering or storing details. It may estimate some demographic data like age or gender, but it cannot recognize individuals.<\/span><\/p>\n<p><b><i>Face recognition<\/i><\/b> <span style=\"font-weight: 400;\">identifies a face in a photo or a video image against a pre-existing database of faces. Faces first need <span style=\"font-size: 10pt;\">to be enrolled in the system to create a database of unique facial features. Afterward, the system breaks down a new image into key features and compares them against the information stored in the database.<\/span><\/span><\/p>\n<\/p>\n<h1><span style=\"font-size: 18pt;\"><strong>Face detection methods\u00a0<\/strong><\/span><\/h1>\n<p><span style=\"font-weight: 400;\">First, the computer examines either a photo or a video image and tries to distinguish faces from any other objects in the background. There are several methods a computer can use to achieve this while compensating for illumination, orientation, or camera distance. <a href=\"https:\/\/www.researchgate.net\/publication\/3193340_Detecting_Faces_in_Images_A_Survey\">Yang, Kriegman, and Ahuja<\/a> presented a classification for face detection methods. 
These methods are divided into four categories, and a face detection algorithm may belong to two or more groups.<br \/><\/span><\/p>\n<p><span style=\"font-weight: 400;\"><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9137045853?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9137045853?profile=RESIZE_710x\" class=\"align-full\"><\/a><\/span><\/p>\n<h2><span style=\"font-size: 18pt;\"><strong>Knowledge-based face detection<\/strong><\/span><\/h2>\n<p><span style=\"font-weight: 400;\">This method relies on a set of rules developed by humans according to our knowledge. We know that a face must have a nose, eyes, and mouth within certain distances and positions relative to each other. The problem with this method is building an appropriate set of rules. If the rules are too general, the system ends up with many false positives; if they are too detailed, it misses faces that do not satisfy every rule. Rules that rely on skin color are a particular problem: they do not work for all skin colors and depend on lighting conditions that can change the exact hue of a person\u2019s skin in the picture.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9137055692?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9137055692?profile=RESIZE_710x\" class=\"align-full\"><\/a><\/span><\/p>\n<h2><span style=\"font-size: 18pt;\"><strong>Template matching<\/strong><\/span><\/h2>\n<p><span style=\"font-weight: 400;\">The template matching method uses predefined or parameterized (deformable) face templates and locates faces by computing the correlation between the templates and the input image. 
A face model can be constructed from edges using an edge detection method.\u00a0<br \/><\/span><\/p>\n<p><span style=\"font-weight: 400;\"><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9137061890?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9137061890?profile=RESIZE_710x\" class=\"align-full\"><\/a><\/span><\/p>\n<p><span style=\"font-weight: 400;\">A variation of this approach is the<\/span> <b><i>controlled background technique<\/i><\/b><b>.<\/b> <span style=\"font-weight: 400;\">If you are lucky enough to have a frontal face image and a plain background, you can remove the background, leaving the face boundaries.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For this approach, the software has several classifiers for detecting various types of front-on faces and some for profile faces, such as detectors of eyes, a nose, a mouth, and in some cases, even a whole body. While the approach is easy to implement, it is usually inadequate for face detection.<\/span><\/p>\n<\/p>\n<h2><span style=\"font-size: 18pt;\"><strong>Feature-based face detection<\/strong><\/span><\/h2>\n<p><span style=\"font-weight: 400;\">The feature-based method extracts structural features of the face. A classifier is trained on these features and then used to differentiate facial and non-facial regions. One example of this method is<\/span> <b>color-based face detection<\/b><span style=\"font-weight: 400;\">, which scans color images or videos for areas with typical skin color and then looks for face segments.<\/span><\/p>\n<p><b><i>Haar Feature Selection<\/i><\/b> <span style=\"font-weight: 400;\">relies on properties that human faces share to form matches from facial features: the location and size of the eyes, mouth, and bridge of the nose, and the oriented gradients of pixel intensities. 
A cascade of 38 classifier stages uses a total of 6061 features to detect each frontal face. You can find some pre-trained classifiers<\/span> <a href=\"https:\/\/github.com\/opencv\/opencv\/tree\/master\/data\/haarcascades\"><span style=\"font-weight: 400;\">here<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/face-detection-for-beginners-e58e8f21aad9\"><i><span style=\"font-weight: 400;\">Source<\/span><\/i><\/a><\/p>\n<p><b>Histogram of Oriented Gradients (HOG)<\/b> <span style=\"font-weight: 400;\">is a feature extractor for object detection. The features extracted are the distribution (histograms) of directions of gradients (oriented gradients) of the image.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Gradients are typically large around edges and corners, which allows us to detect those regions. Instead of considering the pixel intensities directly, the method counts the occurrences of gradient orientations in localized portions of the image. It uses overlapping local contrast normalization to improve accuracy.<br \/><\/span><\/p>\n<p><span style=\"font-weight: 400;\"><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9137073095?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9137073095?profile=RESIZE_710x\" class=\"align-full\"><\/a><\/span><\/p>\n<h2><span style=\"font-size: 18pt;\"><strong>Appearance-based face detection<\/strong><\/span><\/h2>\n<p><span style=\"font-weight: 400;\">The more advanced appearance-based method depends on a set of representative training face images to learn face models. It relies on machine learning and statistical analysis to find the relevant characteristics of face images and extract features from them. 
This method unites several algorithms:<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><b><i>Eigenface-based algorithm<\/i><\/b> efficiently represents faces using Principal Component Analysis (PCA). PCA is applied to a set of images to reduce the dimensionality of the dataset while keeping the components that best describe the variance of the data. In this method, a face can be modeled as a linear combination of eigenfaces (a set of eigenvectors). Face recognition, in this case, is based on comparing the coefficients of the linear representation.\u00a0<br \/><\/span><\/p>\n<p><span style=\"font-weight: 400;\"><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9137084477?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9137084477?profile=RESIZE_710x\" class=\"align-full\"><\/a><\/span><\/p>\n<p><b><i>Distribution-based<\/i><\/b> <b>algorithms<\/b> <span style=\"font-weight: 400;\">like PCA and Fisher\u2019s Discriminant define a subspace representing facial patterns. They usually have a trained classifier that distinguishes instances of the target pattern class from the background image patterns.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><b><i>Hidden Markov Model<\/i><\/b> is a standard method for detection tasks. 
Its states are the facial features, usually described as strips of pixels.\u00a0<br \/><\/span><\/p>\n<p><b><i>Sparse Network of Winnows<\/i><\/b> <span style=\"font-weight: 400;\">defines two linear units or target nodes: one for face patterns and the other for non-face patterns.<\/span><\/p>\n<p><b><i>Naive Bayes Classifiers<\/i><\/b> <span style=\"font-weight: 400;\">compute the probability that a face appears in the picture based on how frequently a series of patterns occurs in the training images.<\/span><\/p>\n<p><b><i>Inductive learning<\/i><\/b> <span style=\"font-weight: 400;\">uses algorithms such as Quinlan\u2019s C4.5 or Mitchell\u2019s FIND-S to detect faces, starting with the most specific hypothesis and generalizing.\u00a0<\/span><\/p>\n<p><b><i>Neural networks,<\/i><\/b> <span style=\"font-weight: 400;\">such as convolutional neural networks (CNNs), are among the most recent and most powerful methods for<\/span> <span style=\"font-weight: 400;\">detection problems, including face detection, emotion detection, and face recognition.<\/span><\/p>\n<\/p>\n<p><strong>Video Processing: Motion-based face detection<\/strong><\/p>\n<p><span style=\"font-weight: 400;\">In video images, you can use movement as a guide. One specific face movement is blinking, so if the software can detect a regular blinking pattern, it can locate the face.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Various other motions, such as flared nostrils, raised eyebrows, wrinkled foreheads, and opened mouths, indicate that the image may contain a face. When a face is detected and a particular face model matches a specific movement, the model is laid over the face, enabling face tracking to pick up further face movements. 
State-of-the-art solutions usually combine several methods, for example, extracting features to be used in machine learning or deep learning algorithms.<\/span><\/p>\n<\/p>\n<h2><span style=\"font-size: 18pt;\"><strong>Face detection tools<\/strong><\/span><\/h2>\n<p><span style=\"font-weight: 400;\">There are dozens of face detection solutions, both proprietary and open-source, that offer various features, from simple face detection to emotion detection and face recognition.<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><b>Proprietary face detection software<\/b><\/span><\/p>\n<p><a href=\"https:\/\/aws.amazon.com\/rekognition\/\"><span style=\"font-weight: 400;\">Amazon Rekognition<\/span><\/a> <span style=\"font-weight: 400;\">is based on deep learning and is fully integrated into the Amazon Web Services ecosystem. It is a robust solution for both face detection and recognition, and it can detect eight basic emotions like \u201chappy\u201d, \u201csad\u201d, \u201cangry\u201d, etc. You can also detect up to 100 faces in a single image with this tool. There is an<\/span> <a href=\"https:\/\/aws.amazon.com\/rekognition\/video-features\/?nc=sn&amp;loc=3&amp;dn=1\"><span style=\"font-weight: 400;\">option for video<\/span><\/a><span style=\"font-weight: 400;\">, and the pricing differs for different kinds of usage.\u00a0<\/span><\/p>\n<p><a href=\"https:\/\/www.faceplusplus.com\/\"><span style=\"font-weight: 400;\">Face++<\/span><\/a> <span style=\"font-weight: 400;\">is a face analysis cloud service that also has an offline SDK for iOS &amp; Android. You can perform an unlimited number of requests, but only three per second. 
It also supports Python, PHP, Java, JavaScript, C++, Ruby, iOS, and MATLAB, providing services like gender and emotion recognition, age estimation, and landmark detection.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The service primarily operates in China, is exceptionally well funded, and is known for its inclusion in Lenovo products. However, bear in mind that its parent company, Megvii, was sanctioned by the US government in late 2019.<\/span><\/p>\n<p><a href=\"https:\/\/lambdalabs.com\/face-recognition-api\"><span style=\"font-weight: 400;\">Face Recognition and Face Detection API (Lambda Labs<\/span><\/a><span style=\"font-weight: 400;\">) provides face recognition, facial detection, eye position, nose position, mouth position, and gender classification. It offers 1000 free requests per month.<\/span><\/p>\n<p><a href=\"https:\/\/www.kairos.com\/\"><span style=\"font-weight: 400;\">Kairos<\/span><\/a> <span style=\"font-weight: 400;\">offers a variety of image recognition solutions. Its API endpoints cover gender and age identification, facial recognition, and emotional depth in photo and video. It offers a 14-day free trial with a maximum limit of 10000 requests and provides SDKs for PHP, JS, .NET, and Python.<\/span><\/p>\n<p><a href=\"https:\/\/azure.microsoft.com\/en-us\/services\/cognitive-services\/face\/\"><span style=\"font-weight: 400;\">Microsoft Azure Cognitive Services Face API<\/span><\/a> <span style=\"font-weight: 400;\">allows you to make 30000 requests per month, at up to 20 requests per minute, for free. For paid requests, the price depends on the number of recognitions per month, starting from $1 per 1000 recognitions. Features include age estimation, gender and emotion recognition, and landmark detection. 
SDKs support Go, Python, Java, .NET, and Node.js.<\/span><\/p>\n<p><a href=\"https:\/\/www.paravision.ai\/\"><span style=\"font-weight: 400;\">Paravision<\/span><\/a> <span style=\"font-weight: 400;\">is a face recognition company for enterprises providing self-hosted solutions. Face and activity recognition and COVID-19 solutions (face recognition with masks, integration with thermal detection, etc.) are among its services. The company has SDKs for C++ and Python.<\/span><\/p>\n<p><a href=\"https:\/\/www.trueface.ai\/\"><span style=\"font-weight: 400;\">Trueface<\/span><\/a> <span style=\"font-weight: 400;\">also serves enterprises, providing features like gender recognition, age estimation, and landmark detection as a self-hosted solution.\u00a0<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><b>Open-source face detection solutions<\/b><\/span><\/p>\n<p><a href=\"https:\/\/github.com\/ageitgey\/face_recognition\"><span style=\"font-weight: 400;\">Ageitgey\/face_recognition<\/span><\/a> <span style=\"font-weight: 400;\">is a GitHub repository with 40k stars, one of the most extensive face recognition libraries. The contributors also claim it to be the \u201csimplest facial recognition API for Python and the command line.\u201d However, its drawbacks are that the latest release dates back to 2018 and that its 99.38% model recognition accuracy could be much better in 2021. It also does not have a REST API.<\/span><\/p>\n<p><a href=\"https:\/\/github.com\/serengil\/deepface\"><span style=\"font-weight: 400;\">Deepface<\/span><\/a> <span style=\"font-weight: 400;\">is a framework for Python with 1.5k stars on GitHub, providing facial attribute analysis like age, gender, race, and emotion. It also provides a REST API.\u00a0<\/span><\/p>\n<p><a href=\"https:\/\/github.com\/davidsandberg\/facenet\"><span style=\"font-weight: 400;\">FaceNet<\/span><\/a> <span style=\"font-weight: 400;\">, developed by Google, is implemented as a Python library. 
The repository boasts 11.8k stars. However, the last significant updates were in 2018. The recognition accuracy is 99.65%, and it does not have a REST API.\u00a0<\/span><\/p>\n<p><a href=\"https:\/\/github.com\/deepinsight\/insightface\"><span style=\"font-weight: 400;\">InsightFace<\/span><\/a> <span style=\"font-weight: 400;\">is another Python library with 9.2k stars on GitHub, and the repository is actively updated. The recognition accuracy is 99.86%. They<\/span> <a href=\"http:\/\/insightface.ai\/\"><span style=\"font-weight: 400;\">claim<\/span><\/a> <span style=\"font-weight: 400;\">to provide a variety of algorithms for face detection, recognition, and alignment.\u00a0<\/span><\/p>\n<p><a href=\"https:\/\/github.com\/SthPhoenix\/InsightFace-REST\"><span style=\"font-weight: 400;\">InsightFace-REST<\/span><\/a><span style=\"font-weight: 400;\">\u00a0is an actively updated repository that \u201caims to provide convenient, easy deployable and scalable REST API for InsightFace face detection and recognition pipeline using FastAPI for serving and NVIDIA TensorRT for optimized inference.\u201d<\/span><\/p>\n<p><a href=\"https:\/\/opencv.org\/\"><span style=\"font-weight: 400;\">OpenCV<\/span><\/a> <span style=\"font-weight: 400;\">isn\u2019t an API, but it is a valuable tool with over 3,000 optimized computer vision algorithms. It offers many options for developers, including the EigenFaceRecognizer and LBPHFaceRecognizer face recognition modules.<\/span><\/p>\n<p><a href=\"https:\/\/cmusatyalab.github.io\/openface\/\"><span style=\"font-weight: 400;\">OpenFace<\/span><\/a> <span style=\"font-weight: 400;\">is a Python and Torch implementation of face recognition with deep neural networks. 
It is based on the CVPR 2015 paper<\/span> <a href=\"http:\/\/www.cv-foundation.org\/openaccess\/content_cvpr_2015\/app\/1A_089.pdf\"><span style=\"font-weight: 400;\">FaceNet: A Unified Embedding for Face Recognition and Clustering<\/span><\/a><span style=\"font-weight: 400;\">.\u00a0<\/span><\/p>\n<\/p>\n<h2><span style=\"font-size: 18pt;\"><strong>Bottom line<\/strong><\/span><\/h2>\n<p><span style=\"font-weight: 400;\">Face detection is the first step for further face analysis, including recognition, emotion detection, or face generation. It is crucial to all these other tasks because it collects the data they need for further processing. Robust face detection is a prerequisite for sophisticated recognition, tracking, and analytics tools and the cornerstone of computer vision.<\/p>\n<p> Originally posted on <a href=\"https:\/\/medium.com\/sciforce\/face-detection-explained-state-of-the-art-methods-and-best-tools-f730fca16294\" target=\"_blank\" rel=\"noopener\">SciForce blog<\/a>.<\/span><\/p>\n<\/div>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/xn\/detail\/6448529:BlogPost:1054932\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Oleksandr Tyron Many of us have used Facebook applications to see ourselves aged, turned into rock stars, or wearing festive make-up. 
Such [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2021\/06\/26\/face-detection-explained-state-of-the-art-methods-and-best-tools\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":458,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/4778"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=4778"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/4778\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/473"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=4778"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=4778"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=4778"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}