Q: I heard something on the radio about Amazon building a robot to help them decide who to hire but the robot was sexist or something and now I’m confused. What’s going on?
A: What is up with all this advanced technology being racist and sexist all the time?! Dark skin not triggering automatic bathroom soap dispensers (bit.ly/2EGvngP)? Facial recognition software telling Asians to open their eyes for their passport photos (https://reut.rs/2S8NheC)? Shouldn’t tech be smarter than that? It’s an old problem that boils down to new technology not being tested on minority populations, and it goes back at least to the invention of scissors made only for right-handed folk.
You’re probably referring to the story “Amazon scraps secret AI recruiting tool that showed bias against women,” reported by Reuters at https://reut.rs/2D1MeJg. It wasn’t a robot—it was a computer program meant to help guide hiring decisions. And it wasn’t sexist; it was just responding to the input it was given. That’s what computers do, often to the chagrin of users and programmers alike.
Such stories have become increasingly common as big tech companies have tried to automate decisions that once fell to lowly humans. After all, if Alexa can understand speech well enough to turn off the lights, shouldn’t her cousins be able to sift through some résumés? One of the hopes for Artificial Intelligence was that it might make decisions based on pure data and rigorous math without succumbing to prejudices like people do.
What went wrong? One of the first lessons of programming is GIGO—“garbage in, garbage out.” It doesn’t matter how great your program is: bad inputs lead to bad outputs.
The type of AI used here is called Machine Learning (ML), because it mimics how humans learn. Consider how kids learn to distinguish cats from dogs. They don’t memorize a set of rules to apply whenever they see an animal. Instead, their parents point to Fido and say “dog,” or tell them to “pet the cat” when Whiskers saunters by.
Different neurons are activated when they see different animals, and eventually their brains associate certain patterns of neurons firing with cats and others with dogs. At that point, they have learned the difference.
Programmers simulate this brain behavior with a program called a neural network. A network is trained by showing it thousands of labeled cat and dog pictures. Then it is tested by asking it to categorize a new picture it has not seen before. If the training was successful (that is, if the network found the patterns that differentiate a dog from a cat), it will correctly identify new pictures without human intervention.
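For the curious, here is a minimal sketch of that train-then-test routine. It is not a neural network: we swapped in a far simpler stand-in that averages the examples for each label and then picks the closest average. The two features (weight and ear pointiness) and every number in it are made up for illustration.

```python
import math

# Made-up training data: (weight in kg, ear pointiness from 0 to 1), label.
TRAINING = [
    ((4.0, 0.90), "cat"),
    ((3.5, 0.80), "cat"),
    ((5.0, 0.95), "cat"),
    ((20.0, 0.30), "dog"),
    ((30.0, 0.20), "dog"),
    ((12.0, 0.40), "dog"),
]


def train(examples):
    """Learn one 'typical' point (the average) for each label."""
    sums, counts = {}, {}
    for (weight, ears), label in examples:
        w, e = sums.get(label, (0.0, 0.0))
        sums[label] = (w + weight, e + ears)
        counts[label] = counts.get(label, 0) + 1
    return {
        label: (w / counts[label], e / counts[label])
        for label, (w, e) in sums.items()
    }


def classify(model, animal):
    """Pick the label whose typical point is closest to the new animal."""
    return min(model, key=lambda label: math.dist(model[label], animal))


model = train(TRAINING)

# Two animals the program has never seen before.
print(classify(model, (4.5, 0.85)))   # prints "cat"
print(classify(model, (25.0, 0.25)))  # prints "dog"
```

Nobody wrote a rule saying “light and pointy-eared means cat.” The program worked that out from the labeled examples, which is the whole idea.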
This technology is powerful: it gave us Alexa, Siri, and Google Assistant. But because of GIGO, it also gave Amazon a “sexist” HR assistant. The quotation marks are there because the program doesn’t really know about men and women; it only knows about inputs and outputs. Since it was trained on résumés that Amazon had accepted and rejected, it learned that Amazon didn’t hire many candidates who graduated from majority-women colleges or listed certain hobbies (softball, maybe, or ballet). Amazon did not provide details, but it did cancel the program.
It turns out that feeding biased data into a program designed to look for patterns causes it to learn the biases and then mimic them. GIGO.
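Here is a toy illustration of how that happens. It is only a sketch with an invented hiring history, not Amazon’s actual system (which was never published): the program scores each résumé keyword by how often past candidates who listed it were hired. The keywords, the data, and the function names are all hypothetical.

```python
from collections import defaultdict

# Invented hiring history: (keywords on the résumé, was the candidate hired?)
HISTORY = [
    ({"python", "chess"}, True),
    ({"python", "chess"}, True),
    ({"java", "chess"}, True),
    ({"python", "softball"}, False),
    ({"java", "softball"}, False),
    ({"python", "softball"}, True),
]


def learn_keyword_scores(history):
    """Score each keyword by the fraction of past résumés with it that were hired."""
    seen = defaultdict(int)
    hired = defaultdict(int)
    for keywords, was_hired in history:
        for keyword in keywords:
            seen[keyword] += 1
            if was_hired:
                hired[keyword] += 1
    return {keyword: hired[keyword] / seen[keyword] for keyword in seen}


def score(resume, scores):
    """Average the learned scores of the keywords this program has seen before."""
    known = [scores[keyword] for keyword in resume if keyword in scores]
    return sum(known) / len(known) if known else 0.5


scores = learn_keyword_scores(HISTORY)

# Two candidates with the same skill and a different hobby.
print(score({"python", "chess"}, scores))
print(score({"python", "softball"}, scores))
```

The second candidate scores lower even though the hobby has nothing to do with the job. The program never saw the word “woman”; it simply copied the pattern in the history it was handed.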
The problem is not new. In 2016, ProPublica claimed to find it in software that predicts recidivism rates for criminal offenders (bit.ly/2JgDqzC). That same year, Microsoft had to shutter an AI-powered Twitter chatbot within twenty-four hours of its launch because it quickly learned to imitate homophobic slurs (https://for.tn/2D1M4Bo).
Even when the results aren’t disastrous, they can be just…weird. Witness the Burger King ad copy created by similar programs: bit.ly/2EFV1SL. Or check out a whole site dedicated to AI weirdness: aiweirdness.com.
What’s the solution? Feeding in unbiased data certainly helps, as does removing irrelevant data points. Some researchers report success with a statistical trick called oversampling—giving their programs three copies of each résumé from a woman, for example, so the program doesn’t notice a discrepancy in previous hiring practices.
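A rough sketch of the oversampling trick, using the three-copies figure from the example above. The résumé records, the “group” field, and the function name are hypothetical, and real projects usually resample at random rather than making exact copies.

```python
from collections import Counter

# Invented training set: six résumés from men, two from women.
RESUMES = [
    {"id": 1, "group": "men"},
    {"id": 2, "group": "men"},
    {"id": 3, "group": "men"},
    {"id": 4, "group": "men"},
    {"id": 5, "group": "men"},
    {"id": 6, "group": "men"},
    {"id": 7, "group": "women"},
    {"id": 8, "group": "women"},
]


def oversample(records, is_underrepresented, copies=3):
    """Return a new list with `copies` of each underrepresented record."""
    result = []
    for record in records:
        repeat = copies if is_underrepresented(record) else 1
        result.extend([record] * repeat)
    return result


balanced = oversample(RESUMES, lambda r: r["group"] == "women", copies=3)

print(Counter(r["group"] for r in RESUMES))   # men: 6, women: 2
print(Counter(r["group"] for r in balanced))  # men: 6, women: 6
```

The original list is left alone; the program is simply trained on the padded copy, where both groups show up equally often.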
We think the problems will persist until someone invents Artificial Judgment (AJ), to coin a term, to go along with AI. We’ll get on that right away, as soon as we extract this left-handed mouse from its packaging with our new left-handed scissors.