Tutorial: Facial Detection in Python
To detect faces, we are going to use a library called OpenCV. OpenCV is an open-source computer vision library that is useful for loads of different things. This tutorial will walk you through setting up OpenCV and using it to get some basic facial detection up and running.
The first step is installing OpenCV. Python has an easy-to-use package manager called pip, so we can install OpenCV straight from the terminal.
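The exact command is not preserved here. Assuming the standard PyPI package name opencv-python, the terminal command is normally pip install opencv-python. As a small sketch, the same command can be built from Python so that it targets the interpreter that will actually run the program (the name install_command is just for illustration):

```python
import sys

# Equivalent to typing "pip install opencv-python" in the terminal, but tied
# to the interpreter running this script, which helps when several Python
# versions are installed side by side.
install_command = [sys.executable, "-m", "pip", "install", "opencv-python"]

# Print the command instead of running it. Paste it into a terminal, or run
# it from Python with subprocess.check_call(install_command).
print(" ".join(install_command))
```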
With OpenCV now installed, all we need to do is import it into our program.
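The import itself is a single line, import cv2 (the package installs as opencv-python but is imported under the name cv2). A minimal sketch that also reports a missing install more gently than a traceback would:

```python
try:
    import cv2  # OpenCV's Python bindings live in the cv2 module
except ImportError:
    cv2 = None  # OpenCV is not installed in this environment

if cv2 is None:
    print("OpenCV is missing; install it with pip first")
else:
    print("OpenCV version:", cv2.__version__)
```

In a finished program the plain import cv2 line is all you need; the guard is only there to make the failure mode friendlier.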
OpenCV uses a system called Cascade Classifiers to perform object detection. Essentially, a Cascade Classifier is a pre-trained model that tells OpenCV how to recognize something. OpenCV's documentation covers the details if you want to read more. In practice, it's a roughly 33,000-line XML file that my puny human brain has no hope of understanding. Fortunately, OpenCV has a bunch of ready-made Cascade Classifiers that you can download and use. So, after downloading the one for faces (haarcascade_frontalface_default.xml, found in the data/haarcascades folder of the OpenCV GitHub repository), we load it into our program.
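Loading the classifier comes down to passing the XML file's path to cv2.CascadeClassifier. The sketch below wraps that in a hypothetical helper, load_face_cascade, which also checks that the load actually worked. It assumes the cascade file sits in the directory the program is run from.

```python
import os

# File name as it appears in the error message quoted in this tutorial.
CASCADE_FILE = "haarcascade_frontalface_default.xml"


def load_face_cascade(path=CASCADE_FILE):
    """Load a cascade classifier, failing loudly if the file is unusable."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Cascade file not found: {os.path.abspath(path)}")
    import cv2  # imported here so the path check runs even without OpenCV
    cascade = cv2.CascadeClassifier(path)
    # A failed load does not necessarily raise; the classifier can simply
    # end up empty, so check for that explicitly.
    if cascade.empty():
        raise ValueError(f"Could not load a cascade from {path}")
    return cascade
```

Usage would be something like face_cascade = load_face_cascade(), where face_cascade is an illustrative variable name.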
Here is where I ran into an issue: OpenCV refused to open the cascade file and printed this error: [ERROR:0@0.028] global persistence.cpp:505 cv::FileStorage::Impl::open Can't open file: 'haarcascade_frontalface_default.xml' in read mode. This error was tricky to pin down because, first, there weren't any ten-year-old forum posts about it, and second, it is a LIE. Well, sort of. Nothing is wrong with the file or its permissions; OpenCV is simply looking for it in the wrong place. A relative path like 'haarcascade_frontalface_default.xml' is resolved against the current working directory, not against the folder the script lives in.
For my development environment, I use VSCode. I had the main project folder open, and my Python code was nested in a subfolder (the same subfolder that contained the cascade file, so that was not the issue). To test my program, I was using the 'Run Program' hotkey in VSCode, which ran the program with the main project folder as the working directory rather than the subfolder containing the cascade file.
There are two easy remedies. The first is to open the subfolder in VSCode instead of the main folder. This lets the run button work as intended, but it may prove to be a hassle if you need to work on files outside of the subfolder. The second is to navigate to the subfolder in the terminal (e.g. cd ./lib) and then run your program from the command line (e.g. python myprogram.py). Both of these resolve the error, and I figure I may as well preserve them here for posterity. You're welcome, weary traveler 10 years from now.
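A third option, not part of the fix described above, is to stop depending on the working directory at all and build the cascade path from the script's own location. A minimal sketch, with cascade_path_next_to as a hypothetical helper name:

```python
from pathlib import Path


def cascade_path_next_to(script_file, name="haarcascade_frontalface_default.xml"):
    """Return the absolute path of a cascade file stored beside script_file."""
    return Path(script_file).resolve().parent / name


# In a real script, pass __file__ so the path follows the script around:
#     cascade_path = cascade_path_next_to(__file__)
#     face_cascade = cv2.CascadeClassifier(str(cascade_path))
print(cascade_path_next_to("lib/myprogram.py"))
```

With this, the program should find the file no matter which folder VSCode or the terminal starts it from.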
With that minor annoyance taken care of, it's time to get an image to detect a face in. To do this, we'll grab the video feed from the device's camera.
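Grabbing frames is usually done with cv2.VideoCapture. The sketch below assumes the default camera is device index 0, which is typical but not guaranteed; open_camera and read_frame are hypothetical helper names.

```python
def open_camera(index=0):
    """Open a camera by index; 0 is usually the default device."""
    import cv2  # imported lazily so read_frame can be used without OpenCV
    capture = cv2.VideoCapture(index)
    if not capture.isOpened():
        raise RuntimeError(f"Could not open camera {index}")
    return capture


def read_frame(capture):
    """Return the next frame from the capture, or raise if the read fails."""
    ok, frame = capture.read()  # read() returns a success flag and the image
    if not ok:
        raise RuntimeError("Could not read a frame from the camera")
    return frame
```

In the main program this would run inside a loop, calling read_frame(capture) once per pass to get the latest image.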
We now have an image from the camera, and it's time to analyze it for faces. There is just one small problem: the image we have is in color, and color means three channels of data per pixel where the detector only needs brightness. To remedy this and improve performance, we will convert the image to grayscale.
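The conversion is a single call to cv2.cvtColor with the COLOR_BGR2GRAY flag (OpenCV stores color frames in blue, green, red order). The sketch below shows that call, plus the weighted sum OpenCV documents for it, written out for one pixel; to_grayscale and gray_value are illustrative names.

```python
def to_grayscale(frame):
    """Convert a BGR camera frame to a single-channel grayscale image."""
    import cv2  # imported lazily so gray_value works without OpenCV
    return cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)


def gray_value(blue, green, red):
    """Brightness of one pixel, using the weights OpenCV documents."""
    return round(0.114 * blue + 0.587 * green + 0.299 * red)


print(gray_value(255, 255, 255))  # pure white stays at full brightness
print(gray_value(0, 0, 255))      # pure red becomes a fairly dark gray
```

Three numbers per pixel collapse into one, which is where the saving comes from.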
With our new grayscale image, it's time to finally perform facial detection. For this, we will use detectMultiScale(). We will pass it three arguments: the image to process, the scale factor, and the minimum number of neighbors. For the image, we pass our grayscale image. The scale factor specifies how much the image is shrunk at each step as OpenCV searches for faces of different sizes. All you need to know is that the higher this number, the faster the detection will run, albeit with lower accuracy. The scale factor must be greater than 1.0.
The number of neighbors is how many neighbors each possible detection needs in order to not get thrown out. Higher numbers reduce the chance of a false positive, but can lead to more false negatives.
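To get a feel for why a larger scale factor is faster, here is a rough model of how many image scales get searched. This is not OpenCV's actual implementation, and the 24-pixel base window is an assumption about the face cascade; count_scales is a hypothetical helper.

```python
def count_scales(width, height, scale_factor, window=24):
    """Roughly count the scales searched for a given scale factor.

    Models the search as a detection window that starts at `window` pixels
    and grows by `scale_factor` until it no longer fits in the image.
    """
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1.0")
    levels = 0
    size = float(window)
    while size <= min(width, height):
        levels += 1
        size *= scale_factor
    return levels


# A small step searches many more scales than a large one.
print(count_scales(640, 480, 1.1))
print(count_scales(640, 480, 1.5))
```

More scales means more chances to match a face of an in-between size, at the cost of more work per frame.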
After a bit of playing with the values, here's what I came up with:
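The exact values are not preserved here, so the sketch below uses 1.1 and 5, which are common starting points and not necessarily the ones chosen above. detect_faces is a hypothetical wrapper; scaleFactor and minNeighbors are the real keyword names on detectMultiScale.

```python
def detect_faces(cascade, gray, scale_factor=1.1, min_neighbors=5):
    """Run the cascade on a grayscale image and return the detections."""
    # The defaults are illustrative; tune them for your camera and lighting.
    return cascade.detectMultiScale(
        gray,
        scaleFactor=scale_factor,
        minNeighbors=min_neighbors,
    )
```

Usage would look like face = detect_faces(face_cascade, gray), assuming face_cascade is the loaded classifier and gray is the grayscale frame.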
And in terms of facial detection, that's literally it. The variable face now holds one entry per detected face: the (x, y) coordinate of its top-left corner, plus its width and height. If nothing was detected, it is empty. To help us visualize the result, we'll display some text regarding the detection status and draw a rectangle around each face before showing the frame in the output window.
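A minimal sketch of the drawing step, using cv2.putText, cv2.rectangle, and cv2.imshow. The status wording, colors, text position, and window title are all assumptions, as are the helper names.

```python
def status_text(faces):
    """Message to overlay, based on whether anything was detected."""
    return "Face detected" if len(faces) > 0 else "No face detected"


def rectangle_corners(face):
    """Turn an (x, y, width, height) detection into two opposite corners."""
    x, y, w, h = face
    return (x, y), (x + w, y + h)


def annotate_and_show(frame, faces, window_name="Face detection"):
    """Draw the status text and one box per face, then display the frame."""
    import cv2  # imported lazily so the helpers above work without OpenCV
    green = (0, 255, 0)  # colors are in blue, green, red order
    cv2.putText(frame, status_text(faces), (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1.0, green, 2)
    for face in faces:
        top_left, bottom_right = rectangle_corners(face)
        cv2.rectangle(frame, top_left, bottom_right, green, 2)
    cv2.imshow(window_name, frame)
```

Looping over the detections means the same code handles zero, one, or several faces in the frame.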
All that's left is to listen for a keypress to terminate the loop (I picked the escape key) and clean up after ourselves.
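The usual pattern is to poll cv2.waitKey once per frame and compare the result to 27, the key code for escape, then release the camera and close the window on the way out. A sketch of the loop's skeleton, where process_frame is a hypothetical callback standing in for the detection and drawing steps:

```python
ESCAPE_KEY = 27  # ASCII code for the escape key


def is_escape(key_code):
    """True if a waitKey result corresponds to the escape key."""
    # waitKey returns -1 when no key was pressed. Masking to the low byte
    # guards against platforms that set extra high bits in the key code.
    return (key_code & 0xFF) == ESCAPE_KEY


def run_until_escape(capture, process_frame):
    """Feed frames to process_frame until escape is pressed, then clean up."""
    import cv2  # imported lazily so is_escape works without OpenCV
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break  # camera stopped delivering frames
            process_frame(frame)
            if is_escape(cv2.waitKey(1)):  # wait 1 ms for a keypress
                break
    finally:
        capture.release()         # hand the camera back to the system
        cv2.destroyAllWindows()   # close the output window
```

Putting the cleanup in a finally block means the camera is released even if something in the loop raises.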
Success.
We now have fully working facial detection! Do what you want with this, I suppose.