Welcome to MKI's Dev Lab! As technology changes, we love to explore, highlight, and test novel concepts and developments.
Facial recognition technology is out there - and it's highly likely you have used it recently, perhaps when logging into a smartphone. There is a wide range of applications for this tech - from device authentication to surveillance to personalized security. But is it better than a thumbprint or the human eye at detecting who you are?
Like most technology, facial recognition is not perfect. Through our hands-on approach, we explore the features - including accuracy, bias, and limitations.
In our MKI Dev Lab (where our Chief Technology Officer Mark Lammert explores various software and coding concepts), we delve into the boundaries of facial recognition.
First, let's break down the functionality.
If the goal is to uniquely identify an individual by name, most of the time it will require an existing picture to cross-reference with a live image.
Example: when you set up a smartphone for the first time and want to enable a face ID, your phone must scan your face. Recall how it has you tilt your head to different angles; this captures a complete enough scan to uniquely recognize your face.
There are other modules, detailed in the features section below, that identify the attributes of a face, such as age and gender. This functionality relies on machine learning or reference algorithms trained on source pictures that have been categorized and classified. The larger and higher quality the sample, the more accurate the algorithms are likely to be.
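To make the classification idea concrete, here is a minimal sketch in Python. Everything in it is invented for illustration - the labels, the tiny 3-number "embeddings," and the reference values (real systems use much larger vectors learned by neural networks). Each labeled group of reference pictures is averaged into a centroid, and a new face is assigned the label of the nearest centroid:

```python
import math

# Hypothetical 3-D "embeddings" for labeled reference pictures.
# Real systems use vectors with 128+ dimensions learned by a neural network.
REFERENCE = {
    "20s": [[0.9, 0.1, 0.2], [0.8, 0.2, 0.1]],
    "30s": [[0.4, 0.6, 0.5], [0.5, 0.5, 0.6]],
    "40s": [[0.1, 0.9, 0.8], [0.2, 0.8, 0.9]],
}

def centroid(vectors):
    """Average the reference vectors for one label."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def classify(face, reference):
    """Return the label whose centroid is closest to the face embedding."""
    best_label, best_dist = None, math.inf
    for label, vectors in reference.items():
        d = math.dist(face, centroid(vectors))
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

print(classify([0.45, 0.55, 0.55], REFERENCE))  # → 30s (nearest centroid)
```

The sketch also shows why sample size matters: with more reference pictures per label, each centroid better represents its category, so new faces land closer to the right one.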
Beware! As with any algorithm, there is inherent bias in the classification, so the technology will not always get it right. Over time and with intentional, collaborative development, algorithms can be improved.
Second, let's explore the features. Where does the tech get it right? Where does it fail? What biases can we find?
The first and arguably most important data point in this technology is the margin of error - how close are the results to being accurate?
The level of accuracy can be estimated from the quality of the image and how many facial data points are captured. It is often expressed as a proportion, with 1.00 being 100% accurate. Very rarely will the technology give a 1.00; a very high accuracy reading is typically 0.99 or 0.98.
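As a rough sketch of how such a proportion might be produced - this mapping is our own illustration, not the tested application's actual formula - a raw match distance can be converted into a score between 0.00 and 1.00:

```python
def confidence(distance, max_distance=1.0):
    """Map a raw face-match distance to a 0.00-1.00 confidence score.
    Smaller distances mean a closer match; the result is clamped to [0, 1]."""
    score = 1.0 - (distance / max_distance)
    return max(0.0, min(1.0, round(score, 2)))

print(confidence(0.01))  # → 0.99, a very strong match
print(confidence(0.21))  # → 0.79, e.g. a partially covered face
```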
In the example below, the analysis of the real-time capture has a 0.99 accuracy rate, which relates directly to the quality of the image and the best match. Where other attributes have a different degree of accuracy, it is noted in parentheses.
And you can clearly see that I am a person, not a cat or horse.
This technology does not worry about whether it is appropriate to ask for, or estimate someone's age. Not very PC at all!
Age is one of the attributes that fluctuates quite a bit depending on the angle of the real-time capture. Several examples show this technology estimating my age in the 20s, and other times in the late 40s or even 50s! Most of the time, the age range does show mid-to-late thirties, which is mostly accurate.
So clearly, age is never shown with 100% certainty. In fact, note that this particular implementation does not show a degree of accuracy for age; rather, the age fluctuates in real time.
Age is also an element that varies from person to person. For instance, the degree to which your face distorts with various expressions and the presence of makeup are variables that affect the age determination.
There are many limitations to estimating age with this technology, and many potential biases: people age differently, and the types of reference pictures and their classifications directly affect the technology's ability to guess age.
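One common way to tame that frame-to-frame jitter - our own sketch of a standard technique, not something the tested application necessarily does - is to smooth the per-frame age estimates, for example with a moving median over recent frames:

```python
from collections import deque

class AgeSmoother:
    """Smooth noisy per-frame age estimates with a moving median."""
    def __init__(self, window=9):
        self.recent = deque(maxlen=window)  # keeps only the last `window` frames

    def update(self, estimate):
        self.recent.append(estimate)
        ordered = sorted(self.recent)
        return ordered[len(ordered) // 2]  # median of the recent window

smoother = AgeSmoother()
frames = [36, 52, 34, 37, 24, 38, 35]  # jittery per-frame estimates (invented)
for est in frames:
    smoothed = smoother.update(est)
print(smoothed)  # → 36: the outliers (52, 24) barely move the median
```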
We recognize that gender is another questionable element to estimate. Like age, gender will fluctuate depending on image quality, angle, and facial expression. Unlike age, the gender attribute does include a degree-of-accuracy calculation. We also acknowledge that the technology makes no accommodation for gender identity and carries its own gender biases.
Interestingly, although I am female, my real-time picture sometimes comes up with a male result at a 0.89 accuracy rating (which is quite high). Most of the time the results are relatively stable, but it is important to understand that hiding part of your face or changing your expression can produce unexpected results.
Another variable to note is that wearing makeup can increase the degree and likelihood of a female result (at least in this test implementation).
Of all the subjective elements to measure, determining expression has to be one of the most amusing! We really had fun testing this attribute.
Unlike other attributes, expression can have more than one result. Your face can register as neutral, happy, sad, disgusted, angry, surprised, or a combination of two or more! There are also degrees to which an expression will register, especially if your results indicate more than one expression.
We can all agree there is a lot of subjectivity that goes into interpreting someone's expression, so let's take this one with a big grain of salt.
To demonstrate the wide range of possibilities here, there are two examples:
1) Mark - note that neutral is consistently listed as the first result, with a different result coming in second. Mark's expressions are not very exaggerated with only a few of the facial points actually changing in each frame.
2) Me (Leah) - my face tends to register more exaggerated expressions, and I rarely have neutral listed as one of my expressions. This could also represent a bias in how I might be perceived through this analysis versus how Mark is perceived.
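The multi-result behavior described above can be sketched as follows. This is a hypothetical illustration - the raw score values and the 0.15 cutoff are invented: normalize the detector's raw expression scores into proportions, then report every expression above a threshold, strongest first:

```python
def top_expressions(raw_scores, threshold=0.15):
    """Normalize raw expression scores to proportions and return every
    expression above the threshold, strongest first. Unlike age or gender,
    more than one expression can register at once."""
    total = sum(raw_scores.values())
    shares = {name: round(s / total, 2) for name, s in raw_scores.items()}
    ranked = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, share) for name, share in ranked if share >= threshold]

# Hypothetical raw detector outputs for a single frame.
frame = {"neutral": 5.1, "happy": 2.4, "surprised": 1.0, "sad": 0.3, "angry": 0.2}
print(top_expressions(frame))  # → [('neutral', 0.57), ('happy', 0.27)]
```

In this toy frame, "neutral" and "happy" both clear the threshold, so the face registers as a combination of two expressions, each with its own degree.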
Finally, the feature that is probably most useful in a real-world context: the best match feature. It compares the real-time picture against a pre-loaded database of pictures with names associated. For this example, Mark loaded our professional head shots into the system with our respective names.
We experienced the most consistent results from best match. The application was pretty good at identifying us by name, even when I tilted my head to the side and partially covered my face. While those variables would skew other attributes, they would generally not disrupt the best match unless too much of the face was covered.
Note that the degree of accuracy overall changes from 0.99 to 0.79 when part of my face is covered.
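A best-match lookup like the one described can be sketched as a nearest-neighbor search over the pre-loaded head shots. The names come from this article, but the embedding vectors and the 0.6 distance cutoff are invented for illustration - the cutoff is what lets the system say "unknown" instead of forcing a match:

```python
import math

# Hypothetical embeddings for the pre-loaded, named head shots.
DATABASE = {
    "Mark": [0.12, 0.80, 0.33],
    "Leah": [0.70, 0.25, 0.60],
}

def best_match(live, database, max_distance=0.6):
    """Compare the live-capture embedding against every named reference
    and return the closest name, or 'unknown' if nothing is close enough."""
    name, dist = min(
        ((n, math.dist(live, ref)) for n, ref in database.items()),
        key=lambda pair: pair[1],
    )
    return name if dist <= max_distance else "unknown"

print(best_match([0.68, 0.27, 0.58], DATABASE))  # → Leah (very close match)
print(best_match([0.00, 0.00, 0.00], DATABASE))  # → unknown (too far from everyone)
```

This also illustrates the stability we observed: a tilted head perturbs the live embedding a little, but as long as it stays within the cutoff of the right reference, the name still comes back correct - only with a lower confidence, as in the 0.99 to 0.79 drop above.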
Combining the features yields interesting and sometimes conflicting results. For instance, I was able to distort my face so that the technology registered it as a 58-year-old man, while the best match still correctly listed Leah, who is a woman in her mid-30s.
Another important question: is the technology smart enough to detect a picture held up in place of a live face? In this implementation, the answer is no, although such a check could likely be constructed.
For this example, I held up a picture of my 3-year-old son, and the application correctly registered his age, gender, and expression. However, this exposes a vulnerability: I could theoretically walk around with a mask of another person over my face, and the technology might register the best match as that person (or as unknown), which would be incorrect.
This is a known limitation of many facial recognition programs.
Facial recognition technology is not about 100% accuracy; rather, it is about the evolution of point-of-reference recognition algorithms that indicate a probable match. While it can automate a certain degree of matching and is more accurate in individualized situations (i.e., device authentication), in more complex real-world applications it may produce biased results and still require human intervention and interpretation.
We continue to make strides in evaluating technology like facial recognition, understanding its point of reference and bias, and identifying use cases that lead us to effective digital transformation.