Facial Recognition Needs Diverse Data

During a recent 60 Minutes segment, Anderson Cooper investigated the use of facial recognition software in criminal investigations. Using complex mathematical algorithms, the software compares a suspect’s face to potentially millions of mugshots in a dataset.

However, these algorithms are built and trained on a finite number of photos drawn from demographically unbalanced datasets. As a result, when the software compares an image to millions of others, it has a harder time distinguishing Black, Asian and female faces in particular. Once a suspect’s face is run through the software, it returns possible matches and ranks them in order of probability.
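To make the ranking step concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular vendor's product works: it assumes each face has already been reduced to a short list of numbers (an "embedding"), and it uses cosine similarity as a stand-in for the match score. The function names, record names and numbers are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two non-zero vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_candidates(probe, gallery, top_k=3):
    """Score every gallery record against the probe and return the
    top_k (name, score) pairs, highest score first."""
    scored = [(name, cosine_similarity(probe, vec)) for name, vec in gallery.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical embeddings standing in for a mugshot dataset.
gallery = {
    "record_a": [0.9, 0.1, 0.3],
    "record_b": [0.2, 0.8, 0.5],
    "record_c": [0.7, 0.3, 0.4],
}
# Hypothetical embedding of the suspect's image.
probe = [0.88, 0.12, 0.33]

ranked = rank_candidates(probe, gallery, top_k=2)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Note that a ranking like this always produces a top candidate, even when the person in the image is not in the dataset at all. That is one reason a ranked list can only ever be a starting point for human review.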

In the case of Robert Williams, police argue that his wrongful arrest was due to sloppy work done by humans, not the software. Ideally, data analysts review the results provided by the software to determine which ones seem accurate, and only then can a result be used as a lead, and a lead only. Police cannot arrest or charge individuals based on facial recognition alone. Even so, human error and biased AI have led to an unknown number of wrongful arrests, and we know of at least three individuals who have filed lawsuits over such errors.

One issue lies with the lack of national guidelines around facial recognition. Individual cities and agencies decide how to use it, who can run it, whether formal training is needed and what kind of images can be used. In some cases, police photoshop a suspect’s facial features, especially when the suspect’s face is partially obscured. They edit in someone else’s facial features to fill the gaps, which further skews the accuracy of results that already come from a problematic algorithm.

It’s been challenging to acquire datasets that are diverse and private, yet easily accessible.

With TripleBlind, we offer the ability for these algorithms to be built and trained on real data, not modeled data, which avoids the biases inherent in modeled data. Algorithms can begin to train on and learn from datasets that represent the real faces of people with a variety of facial features. This hasn’t been done yet because of the lack of solutions that offer complete data privacy and integrity while being efficient and cost-effective.

One of TripleBlind’s most significant features is its compliance with HIPAA, GDPR and other regulatory standards. We offer the sole solution that successfully de-identifies genomic data. We ensure that no one can be re-identified and that the data is never copied and never decrypted. With TripleBlind, we can start filling in the gaps in the diverse data that facial recognition needs in order to be balanced and trusted.

Agencies and cities are facing costly settlements for wrongful arrests. It’s unknown how many other people have been wrongfully arrested, given that some arrested individuals never find out that facial recognition led to their arrest. Using facial recognition is a controversial practice and will be the subject of many laws and regulations that could make cities vulnerable to more lawsuits.