What is algorithmic bias in AI?
For many years, the world assumed that artificial intelligence would not hold the biases and prejudices of its creators. Since AI is driven by cold, hard mathematical logic, the thinking went, it would be completely neutral.
The world was wrong.
AI has, in many cases, manifested the biases that humans tend to hold. In some instances, it has even amplified these biases.
Algorithmic bias refers to systematic unfairness in the outputs generated by an algorithm. Common forms include age discrimination, gender bias, and racial bias.
Algorithmic bias can deepen inequalities among people and directly affect their lives. A person’s chances of landing a job they deserve, for example, could be reduced simply because they belong to a group that the algorithm is biased against.
What causes algorithmic bias?
Bias in the output of an algorithm can generally be traced back to the data it was trained on.
There are two prominent ways in which training data introduces algorithmic bias. First, the people gathering the data may carry personal biases of their own. Second, environmental biases may be imposed, unintentionally or even intentionally, while the data is being collected.
The people gathering the data most likely hold biases they are not even aware of, and they end up projecting those biases onto the data collection process itself.
An algorithm may also not be trained on enough data to represent the actual scenarios the AI system is expected to operate in. For example, there have been instances where algorithms were trained on data pertaining only to Caucasians; those systems ended up generating racially biased outputs.
Similarly, an artificial intelligence system may be trained on data sourced from and about a single region even though the system is intended to be used worldwide. It would not be surprising if such a system generated biased outputs.
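As a rough illustration of this representation problem, the sketch below counts how often each group appears in a training set and flags groups that fall below a chosen share. The field name, the groups, and the 10% threshold are all invented for illustration; this is a minimal audit, not a substitute for a proper fairness review.

```python
from collections import Counter

def audit_representation(records, group_key, min_share=0.10):
    """Flag groups that make up less than `min_share` of the data.

    `records` is a list of dicts; `group_key` names the demographic field.
    Both the field name and the 10% threshold are illustrative assumptions.
    """
    counts = Counter(r[group_key] for r in records)
    total = sum(counts.values())
    report = {}
    for group, n in counts.items():
        share = n / total
        report[group] = {
            "count": n,
            "share": share,
            "underrepresented": share < min_share,
        }
    return report

# Hypothetical training records: a dataset collected mostly in one region.
training_data = (
    [{"region": "North America"}] * 800
    + [{"region": "Europe"}] * 150
    + [{"region": "Africa"}] * 30
    + [{"region": "South Asia"}] * 20
)

for group, stats in audit_representation(training_data, "region").items():
    flag = "  <-- underrepresented" if stats["underrepresented"] else ""
    print(f"{group}: {stats['count']} samples ({stats['share']:.1%}){flag}")
```

A check like this will not catch every problem, but it makes glaring gaps between the training data and the intended deployment population visible before the model is built.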
What are some examples of algorithmic bias?
Some AI systems hold rather dangerous biases. In one horrifying example, Google Cloud Vision labeled two images of people holding hand-held thermometers. One image showed a dark-skinned person holding the thermometer, the other a light-skinned person holding it.
While the image with the light-skinned person was labeled “electronic device”, the image with the dark-skinned person was mislabeled “gun”. Such a bias is horrific and unacceptable.
Fortunately, Google has updated its algorithm since then.
In another instance, Google Photos’ image recognition algorithms classified dark-skinned people as “gorillas”. In a pathetic attempt to fix the issue, Google simply stopped the algorithm from identifying gorillas at all.
When researchers at Princeton University used off-the-shelf machine learning software to analyze 2.2 million words and map the associations among them, they found some shocking results.
In the experiment, words like “girl” and “woman” were strongly associated with the arts, while words like “men” were more strongly associated with science and math. The algorithm also perceived European names as more pleasant than African-American names. In other words, the machine learning algorithm picked up on existing racial and gender biases demonstrated by humans. If the associations learned by such algorithms were used in a search engine’s ranking algorithm, or to generate word suggestions in an auto-complete tool, they could have the terrible effect of reinforcing racial and gender biases.
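To give a feel for how such associations are measured, here is a minimal sketch in the spirit of word-embedding association tests. It is not the Princeton team’s actual code, and the tiny three-dimensional “embeddings” are invented purely for illustration; real studies use pretrained vectors with hundreds of dimensions.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two word vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(word_vec, attribute_a, attribute_b):
    """Mean similarity to attribute set A minus mean similarity to set B.

    A positive score means the word sits closer to A than to B in the
    embedding space.
    """
    return (
        np.mean([cosine(word_vec, a) for a in attribute_a])
        - np.mean([cosine(word_vec, b) for b in attribute_b])
    )

# Toy vectors invented for illustration only.
vectors = {
    "woman":   np.array([0.9, 0.1, 0.2]),
    "man":     np.array([0.1, 0.9, 0.2]),
    "arts":    np.array([0.8, 0.2, 0.1]),
    "science": np.array([0.2, 0.8, 0.1]),
    "math":    np.array([0.1, 0.9, 0.3]),
}

arts_terms = [vectors["arts"]]
stem_terms = [vectors["science"], vectors["math"]]

for word in ("woman", "man"):
    score = association(vectors[word], arts_terms, stem_terms)
    print(f"{word}: arts-vs-STEM association = {score:+.3f}")
```

If embeddings trained on human-written text show a consistently positive score for female terms and a negative one for male terms on an arts-versus-STEM comparison, the vectors have absorbed the stereotype from the text they were trained on.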
In yet another example of algorithmic bias, Amazon had to discontinue a recruiting algorithm after discovering a gender bias in it. The data used to build the algorithm came from resumes submitted to Amazon over a 10-year period, the majority of which came from white males. The algorithm learned to recognize word patterns in the resumes rather than relevant skill sets, and those patterns were benchmarked against the company’s predominantly male engineering department to determine how good a fit an applicant would be. As a result, the software started penalizing any resume that contained the word “women’s” and downgraded the resumes of women who had attended women’s colleges, resulting in gender bias.
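The mechanism is easy to reproduce in miniature. The sketch below trains a simple text classifier on a handful of entirely made-up resumes whose labels are skewed the way the historical hiring data reportedly was; it has nothing to do with Amazon’s real system, but it shows how a proxy token like “women’s” ends up with a negative weight.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical, hand-made "historical hiring" data: resumes mentioning
# "women's" are rarely among the past hires, so the label correlates
# with that token rather than with any relevant skill.
resumes = [
    "captain of chess club, python developer",        # hired
    "java developer, hackathon winner",               # hired
    "python developer, women's coding society lead",  # rejected
    "women's college graduate, java developer",       # rejected
    "c++ developer, robotics team captain",           # hired
    "women's chess club captain, python developer",   # rejected
]
hired = [1, 1, 0, 0, 1, 0]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(resumes)
model = LogisticRegression().fit(X, hired)

# Inspect the learned weight for the proxy token "women" (the default
# tokenizer strips the possessive). A negative weight means the model
# penalizes resumes containing it.
weights = dict(zip(vectorizer.get_feature_names_out(), model.coef_[0]))
print("weight for 'women':", round(weights["women"], 3))
```

The model never sees gender as an explicit feature; it simply learns that a word correlated with women correlates with rejection in the historical data, and reproduces that pattern.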
Algorithmic bias has also been found in online ads. Latanya Sweeney, a Harvard researcher and former chief technology officer at the Federal Trade Commission (FTC), found that online searches for African-American names were more likely to return ads from services selling arrest records than searches for white names. She also discovered the same differential treatment in the micro-targeting of higher-interest credit cards and other financial products when the system inferred that the subjects were African-American, even when their backgrounds were similar to those of whites. She further demonstrated how a website marketing the centennial celebration of an African-American fraternity received continuous ad suggestions for purchasing “arrest records” or accepting high-interest credit card offers.
How do you fix algorithmic bias?
Diversity is the key to addressing algorithmic bias, and diversity efforts should be made at every step of the project or process.
When creating artificial intelligence systems, you need to make sure that the training data properly represents the actual scenarios in which the algorithm is intended to be used.
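Representative data helps, but it also needs to be verified downstream. One simple check is to evaluate the finished model separately for each group it will serve rather than relying on a single overall accuracy number. The sketch below uses invented group names and a stand-in “model”; it only illustrates how an aggregate score can hide a large gap between groups.

```python
from collections import defaultdict

def accuracy_by_group(examples, predict):
    """Compute accuracy separately for each demographic group.

    `examples` is a list of (features, group, true_label) tuples and
    `predict` is any callable that maps features to a label. The field
    names and the model are placeholders.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for features, group, label in examples:
        total[group] += 1
        if predict(features) == label:
            correct[group] += 1
    return {g: correct[g] / total[g] for g in total}

# Hypothetical evaluation set and a stand-in scoring rule.
evaluation_set = [
    ({"score": 0.9}, "group_a", 1),
    ({"score": 0.8}, "group_a", 1),
    ({"score": 0.7}, "group_a", 0),
    ({"score": 0.6}, "group_b", 1),
    ({"score": 0.4}, "group_b", 1),
    ({"score": 0.3}, "group_b", 1),
]

def predict(features):
    return 1 if features["score"] >= 0.5 else 0

for group, acc in accuracy_by_group(evaluation_set, predict).items():
    print(f"{group}: accuracy = {acc:.0%}")
```

If the per-group numbers diverge sharply, that is a signal to revisit the training data or the model before deployment, not after the harm is done.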
AI ethics should be stressed in organizations that build AI systems as well as in educational institutions. Organizations that build AI systems should focus on educating their employees about ethics and cultural differences. Everyone who builds, researches, or works on artificial intelligence systems should be aware of the dangers of algorithmic bias and work to avoid and correct it.