Garbage in, garbage out: is AI discriminatory or simply a mirror of IRL inequalities?

When considering the rise of artificial intelligence (AI), it is useful to remember Tay, an infamous Twitter chatbot launched by Microsoft in March 2016. Tay was intended to ‘learn’ by reading tweets and interacting with other Twitter users. ‘The more you talk, the smarter Tay gets!’ its description read. It took only a few hours before Tay, tricked by social media users, started posting offensive tweets, some of them sexist and racist, such as ‘I fucking hate feminists and they should all die and burn in hell,’ and referring to US President Barack Obama as ‘the monkey.’¹ Microsoft disconnected the chatbot within 24 hours of its launch.

Tay could simply have been a notable exception or a programming mistake. Yet when it comes to AI, these ‘mistakes’ seem to be the rule rather than the exception. There are plenty of other examples of racist and sexist AI systems, often more subtle than Tay, and these systems can have far more detrimental consequences for people’s lives than conversations with a chatbot. AI will be, and sometimes already is, taking important decisions that affect our lives in fields such as policymaking, education, justice, social security, defence, and hiring, which makes it crucial to understand the discrimination programmed into these systems. As the digital environment changes rapidly, we must ensure that technology does not nullify decades of struggle for human rights, dignity and equality.

Why is AI discriminatory? Because of its learning process

The discriminatory nature of AI can be linked to how it functions, in particular to its machine learning and word embedding processes. Machine learning is the process through which AI becomes capable of performing tasks without having been explicitly coded to do so. Instead, it ‘learns’ from patterns found in the enormous data sets it has been fed. In language systems, these patterns are captured by ‘word embeddings [,] algorithms that translate the relationships between words into numbers so that a computer can work with them,’² producing relationships of the form ‘A is to B as X is to Y.’ The issue is that computers cannot distinguish between benign relationships and relationships that pose ethical and moral problems: all they do is look for patterns. Since machines learn by ‘reading’ the material available to them (like Tay on Twitter), they ultimately form relationships such as ‘man is to computer programmer as woman is to homemaker.’³
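The ‘A is to B as X is to Y’ pattern can be sketched with a little vector arithmetic. The example below is only an illustration: the three-dimensional vectors, the word list, and the function names are invented for this sketch and were not learned from any real text. Real embeddings have hundreds of dimensions and are trained on very large corpora, but the arithmetic commonly used to complete an analogy is the same: subtract A from B, add X, and look for the nearest remaining word.

```python
import math

# Hypothetical, hand-written vectors (NOT learned from data).
# The first coordinate loosely plays the role of a "gender" direction that a
# real model might absorb from biased text; the other two stand for
# "technical" and "domestic" contexts.
TOY_EMBEDDINGS = {
    "man":        (1.0, 0.1, 0.1),
    "woman":      (-1.0, 0.1, 0.1),
    "programmer": (0.8, 1.0, 0.0),
    "homemaker":  (-0.8, 0.0, 1.0),
    "engineer":   (0.7, 0.9, 0.1),
    "doctor":     (0.3, 0.6, 0.2),
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def complete_analogy(a, b, x, embeddings):
    """Return the word y that best fits 'a is to b as x is to y'."""
    target = tuple(
        vb - va + vx
        for va, vb, vx in zip(embeddings[a], embeddings[b], embeddings[x])
    )
    # The three query words are excluded, as is usual for this kind of lookup.
    candidates = [w for w in embeddings if w not in (a, b, x)]
    return max(candidates, key=lambda w: cosine(target, embeddings[w]))


print(complete_analogy("man", "programmer", "woman", TOY_EMBEDDINGS))
# prints "homemaker"
```

Nothing in this code mentions stereotypes. The biased answer comes entirely from where the words sit in the vector space, which is the point made above: the machine only follows the pattern it was given.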

Biased machine learning can have tragic consequences. During the 2010s, the United States used a programme called the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) in its criminal justice system to assess risks of recidivism. Without human supervision, COMPAS was recommending longer prison sentences for Black Americans than for white Americans because it had identified a pattern of recidivism based on elements such as ‘residence,’ ‘criminal personality,’ ‘substance abuse,’ and ‘social isolation’ in the dataset on which it had been trained.⁴ COMPAS was thus predicting a higher than actual risk of recidivism for Black defendants and a lower than actual risk for white defendants,⁵ exacerbating existing human biases.
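One way to make ‘higher than actual risk’ and ‘lower than actual risk’ concrete is to compare error rates across groups: how often people who did not reoffend were flagged as high risk (false positives), and how often people who did reoffend were flagged as low risk (false negatives). The sketch below assumes a simplified record format and uses invented records for two anonymous groups, ‘A’ and ‘B’. It is not COMPAS data and does not reproduce any published analysis; the names RECORDS and error_rates_by_group are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (group, flagged_high_risk, actually_reoffended).
# Invented for illustration only; not real COMPAS data.
RECORDS = [
    ("A", True, False), ("A", True, False), ("A", False, False), ("A", False, False),
    ("A", True, True), ("A", True, True),
    ("B", True, False), ("B", False, False), ("B", False, False), ("B", False, False),
    ("B", True, True), ("B", False, True),
]


def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when the rate is undefined."""
    return numerator / denominator if denominator else None


def error_rates_by_group(records):
    """Compute false positive and false negative rates for each group."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, flagged, reoffended in records:
        if flagged and reoffended:
            counts[group]["tp"] += 1
        elif flagged and not reoffended:
            counts[group]["fp"] += 1   # risk overestimated
        elif not flagged and reoffended:
            counts[group]["fn"] += 1   # risk underestimated
        else:
            counts[group]["tn"] += 1
    return {
        group: {
            "false_positive_rate": _ratio(c["fp"], c["fp"] + c["tn"]),
            "false_negative_rate": _ratio(c["fn"], c["fn"] + c["tp"]),
        }
        for group, c in counts.items()
    }


for group, rates in sorted(error_rates_by_group(RECORDS).items()):
    print(group, rates)
```

In this invented data, group A has the higher false positive rate (its risk is overestimated more often) while group B has the higher false negative rate (its risk is underestimated more often). That is the shape of disparity described above, and an audit of this kind is one way it could be detected.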

Why is AI discriminatory? Because we are

Another reason why AI is discriminatory is quite obvious: the engineers and designers behind AI are generally white men. In the United Kingdom alone, more than 90% of coders are men.⁶ Meanwhile, in Europe, only 11.2% of leadership positions in the STEM (science, technology, engineering and mathematics) fields are held by women. This figure rises slightly, to a still meagre 18.1%, in North America.⁷

Due to a lack of diversity in the sector, sexist and racist biases are, consciously or not, incorporated into the algorithms and code that power machine learning and artificial intelligence systems. A well-known example of such biases can be found in facial recognition software: ‘Facial recognition is accurate, if you’re a white guy,’ announced The New York Times. When a photo shows a white man, the software correctly identifies both complexion and gender 99% of the time. However, the darker the skin, the more difficult it is for the software to differentiate between male and female. The error rate is highest for darker-skinned women.⁸

In 2015, Google was criticised for the algorithm it used to auto-label pictures on Google Photos after the system identified dark-skinned men as gorillas. It seems that Google was unable to find a viable solution to this problem and instead simply blocked the words ‘gorilla,’ ‘chimpanzee,’ and ‘monkey,’ so that it is now impossible for the algorithm to label pictures as such.⁹

The lack of diversity also influences the design and names of robots: most humanoid robots have white ‘skin’ and are highly gendered, with gendered names, pronouns, voices, and appearances. Often, one can guess the role of a robot from its ‘gender,’ since ‘male’ robots tend to be used in the army while ‘female’ robots generally serve as personal assistants. Female robots are also sexualised, designed with wide hips, a narrow waist, and a sensual voice. For instance, Valkyrie, a robot created in 2013 and used by NASA, possesses breasts. One can wonder about their functionality.

Why is AI discriminatory? Because of copyright law

The impact of copyright law on biases in AI¹⁰ was uncovered by Amanda Levendowski, Associate Professor of Law and the founding Director of the Intellectual Property and Information Policy (iPIP) Clinic at Georgetown University. Computers need access to texts, works of art, photographs, videos, books, and other documents in order to learn and analyse patterns, but copyright laws complicate and limit access to these materials. As a result, by shrinking datasets, copyright laws also limit the worldview of AI systems.

Meanwhile, works in the public domain (in other words, works that are old enough to no longer be protected by copyright, or recent works that are free from copyright protection) are limited. Levendowski states that, in the US, most of these works were published before 1923. Building human rights-compliant AI systems using this data would be difficult since, before 1923, the concept of human rights was not clearly defined, if it existed at all. The UN did not exist, the Second World War had not yet happened, and social movements were still forming. Before 1923, the word ‘queer’ was used as an offensive term to designate homosexuals. Therefore, an AI relying on works published prior to this date would not be exposed to the word’s current use, now that it has been reclaimed by the LGBTIQ+ community. In fact, such an AI would fail to recognise the acronym ‘LGBTIQ+’ altogether.

Unfortunately, even if AI systems were able to rely on public domain material, issues of discrimination would persist. For example, one survey conducted in 2011 found that 90% of editors on Wikipedia are men. This means that, by using such material, AI systems would still acquire implicit biases in their algorithms.

Therefore, even if a machine is built by a diverse team and without any explicit biases, it could still ultimately make discriminatory decisions. Because data is the basis of its learning process, training it on biased or outdated data is enough to produce this result. The outcome would be no different from that of a system that was explicitly biased in the first place.

How to make AI more inclusive?

More diversity in the STEM and AI sectors would certainly help to make AI less biased and thus less discriminatory. In particular, women, people of colour, and minorities should be encouraged to pursue careers in these fields. In this regard, some initiatives already exist, such as Girls Who Code, AI4ALL, and Re•Work. On another note, technology and computer science departments in post-secondary institutions should offer courses on human rights, notably on issues related to gender and race. In addition, copyright laws should be lifted for AI because this is the only way to ensure AI systems gain access to more inclusive and diverse data, which can in turn limit implicit biases.

A repeat of chatbot Tay could be avoided if greater attention were paid to making machines more inclusive and to properly addressing the risks of racist and sexist algorithmic patterns. This responsibility lies with AI engineers and designers, who must create new algorithms while ensuring that computers do not reiterate or amplify existing stereotypes. They must also build machines that are more neutral and free from gender stereotypes.

Nevertheless, changing algorithms and appearances is not enough. There should also be more transparency in the design process of systems, which is necessary to guarantee respect for human and digital rights. To ensure proper oversight, there should be reviews and assessments conducted by human rights experts and defenders specialised in computer science, while information on algorithms used by these machines should be made readily available. In addition, there should be a clear process by which individuals can submit cases of AI discrimination to courts and obtain redress.

Ultimately, as artificial intelligence systems become smarter and play a bigger role in our societies, the limited field of Human-Computer Interaction¹¹ must expand into a formal field of study in universities and occupy a stronger presence in the human rights sector. In particular, far more attention must be paid to the human rights implications of these emerging technologies, and to their potential to mirror and exacerbate existing inequalities. Considering the dangers of AI exhibited by the chatbot Tay, the international community must act promptly to ensure that human rights-compliant algorithms govern the rapidly expanding digital space.

¹ Women vs. the Machine: Is AI Sexist?

² He’s Brilliant, She’s Lovely: Teaching Computers To Be Less Sexist

³ How AI Learns to Be Sexist and Racist

⁴ Machine Bias

⁵ When an Algorithm Helps Send You to Prison

⁶ Reworking the Gender Balance in the AI

⁷ Women vs. the Machine: Is AI Sexist?

⁸ Facial recognition is accurate, if you’re a white guy

⁹ Google’s Solution to Accidental Algorithmic Racism: Ban Gorillas

¹⁰ How Copyright Law Can Fix Artificial Intelligence’s Implicit Bias Problem

¹¹ Women vs. the Machine: Is AI Sexist?