On Finding and Fixing Latent Racial Bias
The recent protests that unfolded across the United States (and, more recently, around the world) reminded us how important it is to acknowledge and resolve unfair and undue bias in society.
Events like these should teach and remind us to examine what we create, to ensure we don’t repeat the mistakes made before us.
Unfortunately, the models we develop (and currently deploy) can also fall foul of these biases. It is an uncomfortable thought, but the following examples show that the problem is real:
- What if an A.I. job candidate selector preferred names associated with a certain gender?
- What if autonomous vehicles were less safe around a certain race?
- What if your image classifier changed labels with race?
- What if an airport facial scanner discriminated based on skin colour?
Isn’t this exactly what we don’t want?
When I studied machine learning, I was taught methods to remove latent bias: for example, the importance of balanced datasets and of ensuring that your training set is representative of your domain. But in the process of chasing accuracy, we can end up developing a model that doesn’t generalise well.
As an example, say your model is attempting to predict who should be the next United States President from a list of candidates. If you look at the training data for this problem, well, the United States has never had a female President. So if the gender of the individual is an input into the model, could this cause an unfair bias? What about their name? Or even their height?
The problem here is that if the domain of the problem changes, can our models recognise it and cope?
This was really exemplified by Tay, the conversational chat-bot developed at Microsoft and deployed on Twitter. This conversational agent was a massive technical leap forward, but despite this, the progress was overshadowed by its egregious replies.
On one side, many reporters noted that the bot’s terrible replies came in response to terrible questions: but a reply such as “Jews did 9/11” is nothing but awful, and in reality the model should have stopped well short of making such accusatory and dangerous comments.
Yes, AI agents have the ability to discern meaning from complex patterns, but falsely labelling or recognising a phenomenon has to be fixed in this space. There are just some things we can’t afford to get wrong.
In what follows, I cover a few methodologies to remove these biases from machine learning datasets, and how these rules can be implemented.
Method 1: Incorporating Adversarial Inputs into Training
Work by both Google and OpenAI has highlighted that even the best ML models are susceptible to robustness issues: which is to say that if you slightly alter the input to the model, the output can differ greatly.
Take the following example from OpenAI:
Here we see that a latent phenomenon (in the form of white noise) is not only changing the classification of a sample, but is also altering the model’s confidence in its prediction.
Now, this error can happen because the model is trained on data in such a way that (a) it overfits the training sample and (b) it is unable to generalise well.
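To make the robustness issue concrete, here is a minimal sketch of the effect on a toy linear classifier (the weights, labels, and perturbation size are all made up for illustration; real attacks like FGSM apply the same idea to deep networks using gradients):

```python
import numpy as np

# A toy linear classifier: score > 0 -> "cat", otherwise "dog".
# Weights and the sample are hypothetical, purely for illustration.
w = np.array([0.4, -0.2, 0.7])
b = -0.1

def predict(x):
    return "cat" if x @ w + b > 0 else "dog"

x = np.array([0.1, 0.5, 0.2])
print(predict(x))  # score = -0.02, so "dog"

# Nudge each feature slightly in the direction that increases the
# score (for a linear model, the gradient direction is just sign(w)).
eps = 0.05
x_adv = x + eps * np.sign(w)
print(predict(x_adv))  # score = +0.045, flips to "cat"
```

A per-feature change of 0.05 is tiny relative to the input, yet it flips the decision: exactly the fragility that adversarial training tries to remove.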
These issues are widely known, and managing them requires adding equal parts robustness measures and adversarial examples into your training procedures. These can take the form of:
- Altering your training samples to create more training samples (i.e. rotated images should be equivalent). This solution is naturally domain specific, and the researcher should think responsibly and really hard here. Should a CV scanner really treat identical CVs of different genders differently? Should gender even be a feature here?
- Sampling techniques to ensure your training dataset is balanced
- Optimisation methods that make the fit more robust and more stable to slight changes in the input space.
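The second bullet above can be sketched in a few lines. This is a minimal oversampling routine in pure Python (the dataset, class labels, and counts are hypothetical), which duplicates minority-class samples until every class is equally represented:

```python
import random
from collections import Counter

random.seed(0)

# A deliberately imbalanced toy dataset of (features, label) pairs.
dataset = ([((i,), "hired") for i in range(90)]
           + [((i,), "rejected") for i in range(10)])

def oversample_to_balance(samples):
    """Duplicate minority-class samples until all classes are the same size."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append((features, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(random.choices(group, k=target - len(group)))
    return balanced

balanced = oversample_to_balance(dataset)
print(Counter(label for _, label in balanced))  # both classes now have 90 samples
```

Oversampling is the simplest balancing technique; undersampling the majority class or reweighting the loss are common alternatives with the same goal.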
The methods listed above are not exhaustive, nor do they intend to be. Training an ML model is domain specific, but the methods above can certainly help in removing latent bias.
Method 2: Create tests and actively use them
There are some mistakes that we can live with, but there are others that we simply cannot.
Put simply, some models cannot show bias and should be tested in anger.
Models that contain parameters for features such as race, gender, and other discriminatory attributes should be scrutinised. In more complex models (such as deep learning methods), a “never-wrong” data set should be used to ensure that the model is responsible and cognisant. It’s a minimum requirement, and we owe it to society to have one.
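A “never-wrong” set can be as simple as counterfactual pairs: inputs identical except for a protected attribute, which the model must score identically. The sketch below is hypothetical (the `score_cv` function stands in for a real screening model, and is deliberately biased so the audit catches it):

```python
# A hypothetical scoring function standing in for a real CV-screening model.
# It is deliberately biased here, so the audit below catches it.
def score_cv(cv):
    score = cv["years_experience"] * 10
    if cv["gender"] == "male":  # latent bias: this branch must never exist
        score += 5
    return score

# The "never-wrong" set: counterfactual pairs differing only in a
# protected attribute. A responsible model must score them identically.
never_wrong_pairs = [
    ({"years_experience": 5, "gender": "female"},
     {"years_experience": 5, "gender": "male"}),
]

def audit(model, pairs):
    """Return every counterfactual pair the model scores differently."""
    return [(a, b) for a, b in pairs if model(a) != model(b)]

failures = audit(score_cv, never_wrong_pairs)
print(f"{len(failures)} counterfactual pair(s) failed the bias audit")
```

Run as part of CI, a test like this makes the “minimum requirement” enforceable: the model is simply not shippable while the audit fails.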
Models are often incomplete and problems may slip through the cracks, but if the model maker is not held accountable and the construction of these models is not improved to account for these known issues, then the problem will continue.
Yes, we expect our test set to be representative of the domain, but if our model does unexpected things (as with Microsoft’s Tay bot), then the fault still lies with the model maker and the model is not fit for purpose.
In these really weird instances, I’m a firm believer in being more conservative. As per the lessons of the bias-variance trade-off: sometimes being a bit more conservative in our estimates can result in a significantly better outcome, not just for some, but for everyone.
Airport scanners should not discriminate on colour. They just shouldn’t.
Method 3: Surprise Detection and ‘Unknown Unknowns’
Another key issue is that a machine learning model has to understand what its remit is. Take the example from Figure 1 above: by adding noise to the problem, the confidence in its guess actually increases! This tells us that either the confidence measure in this space is clearly miscalibrated, or the model can easily be confused.
Generally in image classification, it’s quite common to use monochrome images; however, this process removes a number of data points which are incredibly valuable. Colour, for example, is just one of many features that you can use to discern an anomalous sample.
Anomalies will always exist and we need to integrate more ways to deal with them.
When you move from training to cross-validation to testing, the dynamics of the domain can shift. Practitioners often assume that the underlying dynamics of the domains are all the same, e.g. that the sampling distribution of age/colour/gender remains the same, but we know this is not always the case. For example, a training set may only contain young people, whilst the test set may only contain old people.
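This kind of shift can be measured directly. Below is a hedged sketch comparing the training and serving distributions of a single feature with a hand-rolled two-sample Kolmogorov-Smirnov statistic (the age samples and the alert threshold are arbitrary assumptions for illustration; in practice a library routine such as `scipy.stats.ks_2samp` would do this):

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two
    empirical cumulative distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    def ecdf(sorted_sample, x):
        return sum(1 for v in sorted_sample if v <= x) / len(sorted_sample)
    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)

# Hypothetical ages: the model trained on young applicants only,
# but is now asked to score a much older population.
training_ages = [22, 24, 25, 26, 27, 28, 29, 30, 31, 33]
serving_ages = [48, 52, 55, 57, 60, 61, 63, 65, 66, 70]

drift = ks_statistic(training_ages, serving_ages)
print(f"KS statistic: {drift:.2f}")  # 1.00 -> the populations do not overlap
if drift > 0.3:  # threshold chosen arbitrarily for illustration
    print("Warning: serving data has drifted from the training domain")
```

A statistic near 0 means the serving data looks like the training data; near 1 means the model is being asked about a population it has never seen.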
Given this reality, it would be preferable if an algorithm said “I can’t process this job applicant as I don’t recognise line 10 on their C.V.” rather than scoring the sample incorrectly. An incorrect score can result in both positive and negative discrimination, which simply isn’t right.
Actively measuring the latent dynamics of your problem space and comparing them to the domain your model was trained on is not only good practice; it could save a lot of pain down the road. What use is an English-to-French translation engine if you now input German phrases? Likewise, if a feature is not familiar, the difference should be measured and acted upon.
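Abstaining on an unfamiliar input, as described above, might be sketched like this (the feature vocabulary, scoring rule, and names are all hypothetical):

```python
# Feature values observed during training (hypothetical vocabulary).
seen_degrees = {"bsc", "msc", "phd"}

def score_applicant(applicant):
    """Return (score, status); abstain when a feature value is unfamiliar."""
    degree = applicant["degree"].lower()
    if degree not in seen_degrees:
        # Refuse to guess rather than risk a discriminatory score.
        return None, f"abstain: unrecognised degree '{applicant['degree']}'"
    base = {"bsc": 50, "msc": 65, "phd": 80}[degree]
    return base + applicant["years_experience"], "ok"

print(score_applicant({"degree": "MSc", "years_experience": 4}))     # (69, 'ok')
print(score_applicant({"degree": "Diplom", "years_experience": 4}))  # abstains
```

The abstention can then be routed to a human reviewer, which is almost always preferable to a silently wrong, potentially discriminatory score.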
We cannot let the same problems that exist within society affect the way technology develops. Given that we recognise these problems exist, we have a responsibility to ensure that these models are not deployed until they are proven to be unbiased.
The European Commission released a study in early January 2020 highlighting a significant number of issues and challenges facing the deployment of AI. The examples at the beginning of this article echo this study but ultimately, a lot has to be done to fix these problems.
We need to constantly measure and improve our technology. We have to be responsible for this change.
Thanks for reading! Please message for any questions!
Keep up to date with my latest work here!