Glenn Stovall

Posted on • Originally published at glennstovall.com

How software can be racist (and what you can do to stop it)

Quick top-hat: This article originally appeared on glennstovall.com. You can also listen to this essay on the Production Ready podcast.

Software is powerful and can shift the balance of power in society. Look, for example, at how social media has changed the way we talk about politics and everything else. If politics is the negotiation of power in society, then software is inherently political. If software is powerful, then we, as people who make software, have power. Software can be racist, and as software developers, I believe we all have a responsibility to use our power to prevent racist software from existing.

If this topic makes you uncomfortable, I ask that you don’t stop just yet. Pause and ask yourself why you feel that way. We all have decisions we have to make, and these topics are unavoidable. Get comfortable being uncomfortable.

Part of why conversations about race are challenging is that we aren’t working from a shared vocabulary. So, I want to clarify a few terms upfront. I’m borrowing my terminology here from White Fragility: Why It’s So Hard for White People to Talk About Racism.

The difference between prejudice, discrimination, and racism

Prejudice consists of thoughts, feelings, and generalizations based on little to no experience and then projected onto a group of people. Everyone has prejudices, whether it's against people of a certain race or gender, where they are from, or where they went to school. If you deny that prejudices exist in others or yourself, then you are powerless to correct them. Prejudice is internal.

Discrimination is an external action based on prejudice, including ignoring, exclusion, threats, ridicule, slander, and violence.

What distinguishes racism from discrimination is that it is backed by legal authority and institutional control. It comes from a system of power that acts independently of and above the actions of any individual, company, or piece of technology.

Racism is hard to define and hard to talk about. It isn’t just people in white hoods committing acts of violence. It’s complicated, and it’s vague. You may imagine “racist” as someone who openly does and says things targeting people. But racism doesn’t have to be overt. It can be subtle. It can happen unintentionally. That’s why it’s easy for people in places of privilege to ignore. Like prejudice, if we don’t do work to acknowledge, identify, and talk about racism, we can’t do anything to stop it. “The greatest trick the Devil ever pulled was convincing the world he didn’t exist.”

Any act of discrimination that props up this system and maintains it as the status quo is de facto racist. Understand that when you work on software, it could be racist, and you have a responsibility to prevent that from happening.

How can technology be racist?

When we build software, we make conscious and implicit choices about how we imagine the end user, what data we will use, and how the algorithms work. Since everyone has unconscious biases and prejudices, there is a possibility that they influence those decisions. Coding biases into software is known as “algorithmic bias.” This bias is most apparent when we look at attempts to use machine learning classification algorithms on people, such as facial recognition.

When engineers on the Google Photos team built a system to tag photos automatically, it tagged black people’s faces as ‘Gorilla’ or ‘Chimp.’ Google’s response was to remove those words as valid tags instead of fixing the root cause.

In her TEDx talk, Joy Buolamwini speaks about the discrimination she faced using facial recognition software. It couldn’t recognize her at all. Joy ran into this problem with multiple systems. She discovered they were all based on the same open-source libraries and data sets. I’ll give the developers that built these tools the benefit of the doubt and assume they didn’t mean to create software that ignores black people. But it doesn’t matter. In this situation, the outcome is more important than the intent.

Software can also enforce existing discriminatory policies. If your company has a history of hiring people of a particular race or gender, and you then use their resumes as training data for software that judges incoming candidates, you’ll end up with a system that has the same prejudices. Amazon did exactly this when it built a system that discriminated against women in the hiring process, and it ultimately had to scrap the project.
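
To make the mechanism concrete, here is a minimal, hypothetical sketch using synthetic data and scikit-learn. It is not based on Amazon's actual system; the feature names and numbers are invented purely to show how biased training labels turn into biased predictions.

```python
# Hypothetical sketch: a resume screener trained on historically biased
# hiring decisions learns to reproduce that bias. All data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 5000

# Candidate features: years of experience and a skills score.
experience = rng.normal(5, 2, n)
skills = rng.normal(0, 1, n)

# Group membership (e.g., gender), encoded 0/1. In this simulation it has
# no effect on actual ability to do the job.
group = rng.integers(0, 2, n)

# Historical hiring decisions: mostly merit, plus a penalty applied to
# group 1. This is the human bias baked into the training labels.
merit = 0.5 * experience + skills
hired = (merit - 2.0 * group + rng.normal(0, 0.5, n)) > 2.5

# Train a screener on those historical decisions. The group column stands
# in for any signal correlated with it (a name, a women's college, a club).
X = np.column_stack([experience, skills, group])
model = LogisticRegression().fit(X, hired)

# Two identical candidates who differ only in group membership:
candidates = np.array([[6.0, 1.0, 0], [6.0, 1.0, 1]])
print(model.predict_proba(candidates)[:, 1])
# The group-1 candidate gets a much lower "hire" probability, because the
# model faithfully learned the bias present in its training labels.
```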

Police departments and courts have used a software tool called COMPAS, which claims to predict recidivism rates. Studies show that it’s not only more likely to be incorrect about black defendants but also more likely to be used against them. Here there is discrimination not only in the software itself but in how the customers of the software use it.

Racism in software can also take the form of digital redlining: abstracting away discrimination by making decisions based on zip code. Delivery services such as Uber Eats and DoorDash are less likely to serve predominantly black neighborhoods. Those neighborhoods, on average, have fewer PokéStops in Pokémon Go. The data doesn’t have to include race explicitly to disproportionately affect people of color. Machine learning is all about finding patterns, after all.
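
Here is a similarly hypothetical sketch of that proxy effect, again with synthetic data: the model never sees race, only zip code, and still reproduces the discriminatory pattern.

```python
# Hypothetical sketch: race never appears in the features, but zip code
# acts as a proxy for it. All data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10000

# Residents of zip code 1 are mostly from the minority group; residents
# of zip code 0 mostly are not. That overlap is the proxy relationship.
zip_code = rng.integers(0, 2, n)
minority = rng.random(n) < np.where(zip_code == 1, 0.8, 0.1)

# Historical "order was serviced" outcomes that were biased against the
# minority group (e.g., drivers declining certain neighborhoods).
serviced = rng.random(n) < np.where(minority, 0.3, 0.8)

# Train on zip code alone; the race column is never used.
model = LogisticRegression().fit(zip_code.reshape(-1, 1), serviced)

probs = model.predict_proba(np.array([[0], [1]]))[:, 1]
print(f"P(serviced | zip 0) = {probs[0]:.2f}")
print(f"P(serviced | zip 1) = {probs[1]:.2f}")
# The model refuses service to zip code 1 far more often, which in practice
# means refusing service to the minority group that lives there.
```

Dropping the sensitive column is not enough; as long as another feature correlates with it, the pattern survives.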

See how it works? When the data doesn’t explicitly involve race, everyone who built these systems can either remain ignorant of, or claim plausible deniability about, how their software might affect minorities. We have to assume that what we build could have negative consequences.

Software can’t abstract away the humans behind it

We can’t hold on to a romantic view that technology is purely logical and immune to humanity’s shortcomings. We can’t assume that since computers don’t think and feel, they can’t be biased, or that we can overcome our biases by being “data-driven.” Our biases shape the design of software, and that software repeats our mistakes. Data can never be entirely objective; our biases (which, it bears repeating, everyone has) seep in through decisions about what data we collect, what data we ignore, and how we interpret it.

Technology is a reflection of who builds it.

A 2011 study by the National Institute of Standards and Technology found that facial recognition software built by Asian companies was more likely to be accurate at identifying Asian faces. Who creates the software has an impact on how it works.

More diverse teams mean fewer blind spots and fewer errors like the ones above making it to production, or worse, the news. Diversity is a competitive advantage. Creating a more diverse team isn’t about doing it to be “fair”; it is about bringing more information and insight to the table than homogeneous teams can.

As Rachel Goodman, a staff attorney for the ACLU’s racial justice program, told Fast Company: “Many of the ill effects are not intentional. It comes from people designing technology in closed rooms in close conversations and not thinking of the real world.”

Do you remember Tay, Microsoft’s attempt at an AI-powered Twitter bot? In 2016, Microsoft launched a bot that would model its behavior on the tweets of the people who interacted with it.

I bet you can guess where this story is going.

“Tay is designed to engage and entertain people where they connect with each other online through casual and playful conversation,” Microsoft said. “The more you chat with Tay, the smarter she gets.”

You can’t invite a group of people to do anything on the internet without attracting trolls. In less than a day, Tay was spouting rhetoric about how “Hitler was right” and “9/11 was an inside job.” Microsoft had to shut the project down in short order.

This story is more a warning about giving trolls an opportunity than anything else, but the main lesson stays the same: artificial intelligence only knows what we tell it, and only acts based on our instructions.

As software developers, what can we do?

Systemic racism is complicated and insidious. It doesn’t come from one place. It comes from everywhere. It happens automatically through systems put in place before we were born, and it continues to spread through American society via hateful acts and the maintenance of an inequitable status quo.

As software developers, we can shape the tools that directly impact people’s lives, and whether they get built at all. This power means that we have a choice to make: are you going to take action, or are you going to be complicit?

There are no two ways about this decision, nor is there any avoiding it. The work you do has an impact on the world, full stop. The nature of that impact is up to you.

Outside of your job, there are actions that anyone can take, regardless of job title. Get out in the streets. Donate your time or money. Vote.

As a software developer, you can advocate for change at your job. We need to be more skeptical of our data sources and cynical about the use cases and effects of what we build. So you should ask yourself:

  • Are the data sets you are using accurate? How can you be sure?
  • In the worst-case scenario, are there ways our software could disproportionately impact a specific group of people? (This includes accessibility, by the way. One way to check is sketched after this list.)
  • Do we hear a diverse set of voices, both before and after building our product?
  • If you are a manager, are you hiring a diverse team? And if not, why not?
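
For the second question in particular, one concrete habit is to compare your system's error rates across groups before shipping. A minimal sketch of such an audit, using made-up data and group labels, might look like this:

```python
# Hypothetical sketch: audit a model's error rates per group. y_true,
# y_pred, and group stand in for your own labels, predictions, and
# demographic (or accessibility) annotations.
import numpy as np

def error_rates_by_group(y_true, y_pred, group):
    """Return the false positive and false negative rate for each group."""
    rates = {}
    for g in np.unique(group):
        mask = group == g
        t, p = y_true[mask], y_pred[mask]
        fpr = p[t == 0].mean() if np.any(t == 0) else float("nan")
        fnr = (1 - p[t == 1]).mean() if np.any(t == 1) else float("nan")
        rates[g] = {"false_positive_rate": fpr, "false_negative_rate": fnr}
    return rates

# Synthetic example where the model is wrong far more often for group "B".
rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, 1000)
group = rng.choice(["A", "B"], 1000)
error_prob = np.where(group == "B", 0.30, 0.05)
flip = rng.random(1000) < error_prob
y_pred = np.where(flip, 1 - y_true, y_true)

for g, r in error_rates_by_group(y_true, y_pred, group).items():
    print(g, r)
# Sharply different rates between groups mean the system treats those
# groups differently, whatever the intent behind it was.
```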

These are the types of questions we all need to be asking more often. Technology is not amoral, and it is not apolitical. Despite all of these stories, I still believe technology’s influence can be used for positive change. But that can only happen if we collectively start doing business differently than we have up to this point. So I’d like to take this opportunity to ask you:

What are you going to do differently?

Oldest comments (14)

David Sullenbarger • Edited

old guy here: we never saw it as race-related or anything even remotely related to that stuff, it was used for its unambiguous meaning (notice how hard they are to replace in some use cases?)

However, now that I've been exposed to the idea, I can't get it out of my head and agree that it needs to change (but it still annoys me a little)

(also, hello neighbor!)

Rene Padillo 🇵🇭 • Edited

agree. for me this is just the workings of politics being integrated into software development.
being PC does not solve anything with what we are doing as developers.

David Sullenbarger • Edited

No, it doesn't solve anything related to developing software, but developers live on the same planet as everyone else, and why die on this hill? It's not even an interesting hill to us (as developers).

Copyrighting a freaking API is a much more interesting hill. One that I'd be thrilled for the chance to fight Oracle to the death over :-)

Pacharapol Withayasakpunt • Edited

You mean software is, to some extent, centered on "human languages," particularly English, on the client side?

Also, it is not only data-driven but also driven by tech giants, whose creators are largely American.

Otherwise, I don't even associate software with English and the history behind its native speakers.

Glenn Stovall

Most software companies being American isn't something I had thought about, but you're absolutely right.

Sam Markham

Excellent article, thank you!

Ben Sinclair

I think there's an ironic bias in your article where you imply the audience for the article is entirely made up of Americans :P

It doesn’t come from one place. It comes from everywhere [...] and continues to spread through American society

Also, your link to "donate your time and money" doesn't work.

Glenn Stovall

Thanks for pointing out that error; I've fixed the link.

And you're totally right. I've only ever lived in the US, and it's easy to think of the US as the default, even though I can look at the analytics for the podcast and see that a third of the listeners live in a different country.

On the other hand, I have no idea how race relations and police behavior are in other countries. I'm not an expert on any of this, and all I know is what's going on in my country and what I see going on downtown where I live every day.

Thinking about writing and speaking to an international audience is something I'll have to work on.

Stephanie Morillo

Excellent article, Glenn, thank you!

I have a particular interest in this space. In 2016, I was the editor for the race section of the Responsible Communication Style Guide, which is meant for technologists who want to use precise language and avoid vocabulary that negatively impacts (or propagates stereotypes of) marginalized groups.

Other resources include:

When exploring "master/slave" terminology in MySQL earlier this year, I learned from a program participant that the earliest mention found of "master/slave" in an engineering context was in 1904.

It's incredibly important that we continue to analyze the language we use for tech terms as language evolves; in 2020, we don't use the same terms in the same contexts that people who lived in the 19th century did. And as devs we're already accustomed to paradigm shifts in programming languages. New concepts are introduced and, with them, new labels. Updating our language is a reflection of this understanding as well as respect for our fellow humans, since all the code we write is meant to impact real people in some fashion.

Glenn Stovall

The communication guide looks awesome! I'm going to have to dig into that some more. Word choice is important and I'm glad some people have put real thought into it and put together a resource like this.

Philip Oakley

Thank you for the engineering context paper. I'll look that one out.
Often the social problems come from the misuse of terms as euphemisms for local cultural bias ("what school did you go to?" - Saint X means they're Catholic, hence not employed here...). It's often these little 'hints' that quickly build into the misplaced biases of sexism and racism (and the others).

Mike Talbot ⭐

To make a point of discussion: are you hiring a diverse team -> unfortunately, no, not particularly in my case. This is the problem with long-term systemic "racism" and "sexism": there just aren't as many people from certain backgrounds who have the skills and experience. Only time and consistent positive pressure will change this, and the changes happen inconsistently across the world.

My recruitment catchment is predominantly white, and within the BAME (and female) communities there are fewer people who have considered technology an appropriate career in the past. If I run a meritocracy on skills, there will be proportionally fewer people from those groups than even the already low population split suggests. So perhaps I don't want a meritocracy; my users don't live where my business is headquartered, and I want to reflect them, not my hiring pool. So we try, but it isn't easy. Having an internal recruiter from the community is the only thing that helped, but sadly he moved to a different area of the country and a new role. It's hard to find another like him... yet another example of the underlying issues.

Glenn Stovall

Thanks for sharing this. It's easy to be idealistic about hiring but even if you want to it can be difficult.

dc2045

Awesome post