This Dot

The Future (of AI) is Female: How Hiring Bias Mitigation in NLP Can Be Great for Women Now and in the Future

Tracy Lee | ladyleet ・9 min read


AI technology is often billed as an answer to the physical and mental shortcomings of the human brain and its productive capacity. We think of machine processes as wholly objective, separate from human bias and prejudice, without considering that machines can only learn from the data that we provide them. As AI-powered technologies continue to permeate every industry, organization, and social structure, we are seeing the negative influence that our history of deeply encoded gender bias has had on contemporary digital innovation.

These problems, at times, seem so insurmountable that some are even led to question whether Artificial Intelligence will actually damage our progress toward a more equitable society. The intent of this article is to show why Artificial Intelligence may suffer from the prejudice ingrained in human language, especially when it comes to evaluative software like applicant scoring programs, and why recognizing this issue and taking action could be great for women in tech.

Human language, especially written language, has always reflected the biases of historically platformed groups. In the English language, literary history has assigned feminine connotations to many negative terms like “chattering” and “bitchy”, while similar behaviors in men might be described as “gregarious” and “assertive”, respectively. In turn, our machines are teaching themselves this very same bias through Natural Language Processing. As we teach ourselves to deprogram socially ingrained prejudices within ourselves, will we be as diligent for our machines?

Of course, this is a pressing issue, but I believe AI can be one of the greatest tools for combating social inequality moving forward. However, the only way to do this is to begin mitigating the bias inherent to NLP through reevaluating the way that algorithms interact with language that might reflect unconscious or intentional bias in human speech. And the best way to do this is to begin balancing the ratio of men and women who are contributing to the development of these technologies.

Currently, only 1 in every 4 computing positions is held by a woman. And when we look even closer, we see that, among women working as developers, there is a significant disparity in seniority compared to their male colleagues. Some might point to this as a reflection of disinterest in STEM among women, but the numbers show that this is not true. In both 2017 and 2018, women made up roughly 40% of all coding bootcamp graduates. However, a significant proportion of these graduates are unable to bridge the gap from formal education to a first junior development role, and even when women do launch development careers, their male colleagues are, at this time, 3.5 times more likely to be in senior-level positions by the age of 35.

AI has the capacity to be one of the most integral tools in eliminating human bias. It is our responsibility to ensure that learning algorithms are not teaching themselves the very patterns of problematic thinking that objective, evaluative software is meant to correct. This, however, is not simply a problem of addressing technical shortcomings, but an opportunity to empower femme developers, who bring not only their technical talent but their experiences as women to the work of identifying where NLP is susceptible to negative gender bias. It is time for project managers and C-level executives to take a step back and evaluate whether their teams are demographically balanced, and whether their team structures are built to uplift junior developers, a level where an overwhelming proportion of women find themselves perpetually stuck.


Human language is perhaps the most critical way that gender bias is perpetuated and reinforced within culture. Antiquated Western stereotypes about the roles of men and women inform the unconscious associations we make between words and gender. Words that reflect communal or collaborative values have become associated with women, while words reflecting industrious traits are often assigned to men.

This wouldn’t necessarily be a problem were it not for deeply ingrained social imbalances between men and women, reflected in the way that different types of work, and the language we use to describe work, have been gendered and subsequently valued against other types of work or traits.

This has created semi-conscious value differentiations between words that describe behaviors that society associates with femininity, and those that describe masculinity. It is a textbook example of the Whorfian Hypothesis, which holds that the language we speak shapes, and in turn reflects, the way we think about the world.


When thinking about how machines learn, I am reminded of an Introduction to Philosophy class I took while I was in school. I don’t know if this is a common thought exercise for college underclassmen, but one of the essay prompts asked us to make an argument for whether or not it is ethical to “kill” a computer.

Of course, I’m sure the professor would have accepted any compelling argument, but she seemed partial to the idea that a computer is not so unlike a human mind. Computers, like humans, receive input, reference the functions and processes that make meaning out of that input, and produce output. Whether those processes are encoded by a scientist or by our lived experiences is perhaps not as important as we might believe. Machine learning narrows the difference between these encoded processes even further: AI technologies could theoretically learn in a way so similar to how a human does that their being, so to speak, would be indistinguishable from ours.

Artificial intelligence is an often misunderstood science. We aren’t creating robots in a vacuum, “born” with some a priori ability to objectively analyze data points. Not unlike a human brain, a machine must learn by observing the data available to it. So the problem of AI technologies internalizing the same prejudices that permeate society is a completely realistic, and observable, phenomenon.

When we discuss gender bias in AI, we are often referring to a problem that arises within a class of artificial intelligence known as Natural Language Processing (NLP). NLP is a subfield of AI that deals with the extraction and analysis of data from unstructured human language. Some might argue that computers can’t “understand” what words mean with quite the same subjectivity and nuance as a human being, but the reality is that what computers do with language is not far off from what we do. They extract values from words through a slew of different identifiers and context clues, including, but not limited to, grammatical, syntactical, and lexical features, as well as the complex contexts and connotations implied by the relationships between words as they appear in human writing or speech. They then use these features to form contextual analyses of the data in question and, like us, come to conclusions about that input.
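The mechanics can be illustrated with a toy example. The corpus below is invented and deliberately skewed, and counting co-occurrences is a drastic simplification of real NLP models, but it shows how an algorithm's "knowledge" of word associations is nothing more than a summary of its training data:

```python
from collections import Counter
from itertools import combinations

# A tiny, deliberately skewed "corpus" standing in for biased training text.
corpus = [
    "he is a doctor",
    "he is a doctor",
    "he is a doctor",
    "she is a doctor",
    "she is a nurse",
    "she is a nurse",
]

# Count how often each pair of words appears together in a sentence.
cooccurrence = Counter()
for sentence in corpus:
    for a, b in combinations(sentence.split(), 2):
        cooccurrence[(a, b)] += 1
        cooccurrence[(b, a)] += 1

# The model has no opinions of its own: its "knowledge" of who is a doctor
# is just these counts, inherited directly from the corpus.
print(cooccurrence[("he", "doctor")])   # 3
print(cooccurrence[("she", "doctor")])  # 1
```

Nothing in the code mentions gender, yet the model ends up associating "doctor" with "he" three times more strongly than with "she", purely because of what it was shown.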


Beginning in 2014, Amazon.com built an artificial-intelligence-powered recruitment tool to help it quickly find the best talent. The system worked by reviewing resumes for specific keywords drawn from over ten years of the company’s hiring data, and ranking those resumes by their similarity to past hires.

It did not take long for Amazon to realize that this algorithm penalized resumes submitted by women. In fact, those who worked on the project reported that resumes of applicants from women’s colleges, or resumes that even contained the word “women”, were given less preference by the software. This, of course, is because the overwhelming majority of Amazon’s technical workforce is male; 2017 stats showed that women made up only 40% of its total workforce. Amazon attempted to mitigate the problem by neutralizing terms that denote demographic information, but excluding select words cannot address gender encoding in all forms of language. Recognizing that it could not account for every way the technology might discriminate against certain groups by assigning different values to words with prejudicial cultural imprints, the company eventually discarded the software in early 2017.
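Public reporting did not include Amazon's actual code, but the basic mechanism of "rank new resumes by similarity to past hires" can be sketched with a hypothetical toy scorer. Everything below, including the sample resumes and the scoring function, is invented for illustration:

```python
from collections import Counter

# Hypothetical "past hire" resumes standing in for ten years of hiring data.
# Because past hires skewed male, some words never appear in winning resumes.
past_hires = [
    "software engineer chess club captain",
    "software engineer men's rugby team",
    "software projects chess club",
]

# Word familiarity learned from past hires.
vocab = Counter(word for resume in past_hires for word in resume.split())

def score(resume: str) -> float:
    """Average familiarity of a resume's words, relative to past hires."""
    words = resume.split()
    return sum(vocab[w] for w in words) / len(words)

# An otherwise identical resume scores lower simply for containing a word
# the model never saw among past hires.
print(score("software engineer chess club"))          # 2.25
print(score("software engineer women's chess club"))  # 1.8
```

No one programmed the scorer to penalize "women's"; the penalty falls out of the training data, which is exactly why neutralizing a handful of terms cannot fully fix the problem.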


There is no easy way to address the problem of gender bias in NLP-based Artificial Intelligence. If there were, it’s safe to say that some of the world’s largest tech companies, if only for the sake of public perception (though I would like to have more faith in humanity), would have already implemented these fixes. The problem is much deeper. We need to look at our entire labor culture and ask ourselves: why is it that over half of university graduates are women, and yet only 5% of S&P 500 CEOs are women? Women are pursuing STEM education and finding their way into increasingly diverse professional areas. Yet, looking at just the tech industry, the very industry that could create software to mitigate hiring bias with AI, women are not being placed in roles at a rate that reflects the pipeline created by bootcamp education, and when they are, they are not rising through those roles like their male colleagues.

Achieving a solution will not come overnight, but it is not an impossible feat either. We know that NLP technology needs to be able to draw data from unstructured language without giving weight to the biases that, through the course of human history, have been deeply encoded into the language from which that data is drawn.

The technical solution to this has yet to be discovered. However, it is crucial that, in the pursuit of developing these technologies, women play a commanding role. In many ways, driving the direction of our digital tools is as much a social science as it is engineering. We need to internalize the value that lived experience brings to conversations about mitigating bias through technology. To approach redesigning applicant scoring and recruitment software without creating more inclusive workplace environments for women in tech is putting the cart before the horse. Too many C-level leaders focus on achieving diversity metrics rather than fostering inclusive environments. If we start by supporting the success of women in tech, who are historically disadvantaged in this industry, we can begin to create the environments where technological solutions can be born of deeper appreciation for the ways that our past traditions inform the language that we use, the ideas that we propagate, and the workplaces we build.


These bias mitigation technologies will be amazing tools for our children, grandchildren, and all of the talented women who will enter the workforce over the coming decades. But as of now, NLP software is not developed enough to reliably prevent the same insidious biases to which non-augmented hiring processes tend to fall victim.

The truth is that these technologies are not the answer to creating a more equitable tech space. They can be a great tool, but we need to work harder to help women overcome the obstacles that prevent them from accessing necessary educational and work opportunities, and make this industry into a space to which women want to contribute, in which they want to be, and where they feel that not only their talent, but their unique experiences are valued.

It will require companies to contribute more of their time, energy, and resources to making their businesses places where women feel they can receive positive, constructive mentorship and meaningful routes for advancement. As a web development consultancy, This Dot Labs is doing all it can to give back to the women who make this industry so great by creating avenues through which companies can invest in uplifting developers from historically underrepresented demographics. In the summer of 2019, we launched our Open Source Apprenticeship Program, partnering with several wonderful companies that recognize the value of connecting talented women with paid opportunities to contribute to their open-source projects.

This solution, however, will require our entire industry to internalize the belief that when our industry is more equitable, our technologies will become more equitable. We owe it to ourselves, to the future of this industry, and to the millions of people who use and will use NLP and other AI technologies to enrich their lives.

This Dot Inc. is a consulting company with two branches: the media stream and the labs stream. This Dot Media is responsible for keeping developers up to date with advancements in the web platform. To inform developers of new releases or changes made to frameworks and libraries, it hosts events and publishes videos, articles, and podcasts. Meanwhile, This Dot Labs provides teams with web platform expertise through methods such as mentoring and training.



I think it's wild that people assume AI (which, at this point, means ML) will automatically eliminate bias. Surely they forget that the output is only as good as the input, and the input is our history, with all of its inherent prejudices. I think ML can be great for identifying bias, but removing it seems like a problem on the same scale as removing bias in humans.


People assume a lot of things about AI. One senior dev I know is fully convinced that AI will surpass human intelligence by 2029, and the singularity will arrive in 2045 (a common view, of course). Humans will survive, of course, because we'll have BCI (Brain-Computer Interface) by 2027 and superfast communication networks, which means brain uploading and downloading will convert us into a hive mind.

I don't know what to say. :|


I mean, on the one hand, we have people on Dev hacking with currently existing brain-computer interfaces. On the other hand, lol.


Is there really a problem of bias in NLP-based AI? It feels to me like an arbitrary assumption made to support your chosen topic of "female".


There is. Another example is Google Translate: if you take a phrase in English like "she is a doctor", translate it to a language with no gendered pronouns, and then back to English, you get "he is a doctor".
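The round trip can be simulated with a toy dictionary "translator" (the genderless language below is invented): the forward pass collapses "he" and "she" into one pronoun, and the backward pass must pick a single English pronoun, typically whichever the model saw most often in training:

```python
# Invented "genderless language" dictionaries; "o" covers both "he" and "she".
TO_GENDERLESS = {"he": "o", "she": "o", "is": "dir", "a": "bir", "doctor": "doktor"}

# Translating back, the system must choose one English pronoun for "o";
# a frequency-trained model picks whichever it saw most, here "he".
FROM_GENDERLESS = {"o": "he", "dir": "is", "bir": "a", "doktor": "doctor"}

def round_trip(sentence: str) -> str:
    genderless = [TO_GENDERLESS[word] for word in sentence.split()]
    return " ".join(FROM_GENDERLESS[word] for word in genderless)

print(round_trip("she is a doctor"))  # "he is a doctor"
```

The gender information is genuinely destroyed in the middle step; the bias appears in how the system fills that gap on the way back.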


This is a case of how our language is used and I don't think simply changing the pronouns will solve all problems. You can have a perfectly gender-neutral language and still have biases in the AI because . . . well, humans have biases (conscious or subtle) no matter what they do!

That's the whole point: we can't claim that AI doesn't have bias just because it's a machine, when it's being trained with data generated by humans with bias.

Yup. And the solution is ... honestly, I have no idea! :D :P


I tried it and confirmed you are right; there is a group attribution bias there, probably caused by the training data rather than the developers. But I'm interested to know: what's your solution for this specific "problem"?

You got it. It's not that developers are mean and create biased AIs; it's that the data used to train them has bias, and that bias is picked up by the AI.


Just trying to understand here: what should the behavior be in the ideal case? (50% of the time "he" and 50% of the time "she"?)

Since, when we convert a sentence to a language with no gendered pronouns, there is no way to retain the gender when converting it back to English, right?

Maybe we can tweak our AI engines to add a temporary variable that retains what the pronoun was before translation. :D :P Honestly, I find the whole situation ridiculous.

There is no ideal case; this is not an easy problem to solve. What we can do is stop pretending that these AI algorithms are bias-free and always right and fair. They are not.

So we let them rule our lives with biases and ruin the folks who happen to be in the crosshairs (law enforcement, for example)? :|

Look at the example brought to Twitter by DHH (creator of Ruby on Rails and founder of Basecamp): he and his wife applied for the new Apple Card, but she got less credit than him. This happened even though they do all their finances together, and she has better scores than him.
As it turned out, the Apple Card is backed by a Goldman Sachs bank, and it uses an AI algorithm to make those decisions.
Many other people started sharing similar issues applying for it, including Steve Wozniak, whose wife also got less credit approved even though she also has a better credit score than him.
When Apple customer service was called, they all seemed to just say it's the algorithm making the decision, and no human has a way to rectify it (this is the scary part).
As you suggested, regulation and law are what will be required to ensure we don't hand over decision making to AIs that were trained using biased data (that's all the data we have).
Or, we could look for a way to remove bias from data!
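One line of research on removing bias from data is the "hard debiasing" of word embeddings proposed by Bolukbasi et al. (2016): subtract from each word vector its projection onto a learned gender direction, so gender-neutral words end up equidistant from "he" and "she". A minimal sketch, with made-up 2D vectors standing in for real embeddings:

```python
# Hard-debiasing sketch: remove a word vector's component along a gender
# direction. The 2D vectors below are invented for illustration.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def neutralize(word_vec, gender_dir):
    # Subtract the projection of the word vector onto the gender direction.
    scale = dot(word_vec, gender_dir) / dot(gender_dir, gender_dir)
    return [w - scale * g for w, g in zip(word_vec, gender_dir)]

gender_dir = [1.0, 0.0]   # e.g. the direction of ("he" - "she")
doctor = [0.6, 0.8]       # biased: leans toward the "he" end of the axis

debiased = neutralize(doctor, gender_dir)
print(debiased)                   # [0.0, 0.8]
print(dot(debiased, gender_dir))  # 0.0
```

This kind of projection removes the measurable gender component, though follow-up research has shown that bias can persist in subtler relationships between words, so it is a mitigation rather than a cure.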

Honestly, I sometimes feel that we should halt the march of "progress" right here. We have enough efficiency and comforts in the world today, and could do without more large-scale destruction. :|


Equity and equality are not equal! Forcing a 50-50 split of men and women in workplaces doesn't mean equity, and it's not optimal for women, men, or society as a whole. I don't know why people have a hard time understanding this!


The problem is not wanting to have 50-50 of men and women, or even an equal distribution of ethnicities. The problem, amongst others, is that nowadays many companies, recruiting firms, and HR departments are using AI algorithms to screen job applicants, give them a score, and decide whether or not to move them forward in the recruitment process. As described in this article with the case of Amazon, and in others that keep showing up, these algorithms are heavily biased, and will score women or non-whites lower, and also recommend lower salaries. Because that's what the data used to train those models says.


It's amazing that people don't realise that by teaching an algorithm with examples, you import whatever bias there is in the data into your algorithm.
Many data scientists argued in the past that "math is not biased". It's not, but the data you are using to train your model is.