Algorithmic bias and how to avoid it in software projects

| Sieuwert van Otterloo | Artificial Intelligence

Many software projects use Artificial Intelligence algorithms to make important decisions. If these decisions are incorrect, this can have serious consequences: people may be refused service, or have to wait or take extra steps. Recent research has found that several AI algorithms used for face detection are biased. This ‘algorithmic bias’ or lack of ‘algorithmic fairness’ is a severe problem for software projects implemented in the real world. In this article we explain the problem and how to handle it from a project management perspective.

What is algorithmic bias?

Algorithmic bias means that the performance (accuracy) of an algorithm is not uniformly distributed: it is much higher for some inputs than for others. Algorithmic bias becomes an issue when algorithms are applied to human subjects and the people using the algorithm's results are not aware of the bias.

A very concrete example of algorithmic bias was discovered by Joy Buolamwini around 2016, when she noticed that facial detection software seemed to work much better on lighter-skinned people than on darker-skinned people. She investigated this discovery with her colleague Timnit Gebru and published the results in 2018 in a paper called ‘Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification’. In this paper, they investigated the accuracy of three commercial systems: Microsoft Face Detect, IBM Watson and Face++. They tested the algorithms on a set of photos of people’s faces, using each algorithm to predict whether the person pictured was male or female.

Each algorithm gave the correct answer between 87% and 93% of the time; in other words, on the whole data set the AI was roughly 90% accurate. The best results, however, were achieved for lighter-skinned men. On the smaller subset of darker-skinned women, the algorithms were only between 65% and 80% accurate: a much lower score. The algorithms’ accuracy is thus biased against women and against people of color. While 80% accuracy might be good enough for some applications, it is definitely not enough for many others and can lead to disastrous results.
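To see how an overall score can hide this kind of disparity, here is a minimal sketch with made-up numbers (not the Gender Shades data): the aggregate accuracy looks respectable while one subgroup scores far lower.

```python
# Illustrative only: invented counts showing how overall accuracy
# can mask a poorly served subgroup.
results = [
    # (subgroup, correct predictions, total samples)
    ("lighter-skinned men",   99, 100),
    ("lighter-skinned women", 93, 100),
    ("darker-skinned men",    88, 100),
    ("darker-skinned women",  68, 100),
]

total_correct = sum(c for _, c, _ in results)
total_samples = sum(n for _, _, n in results)
print(f"overall accuracy: {total_correct / total_samples:.0%}")  # 87%

for group, correct, n in results:
    print(f"{group}: {correct / n:.0%}")
```

Reporting only the 87% headline number would completely hide the 68% score for the worst-served group, which is exactly the pattern the Gender Shades paper documented.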

The consequences of bias

This bias is problematic when AI algorithms are used in practical applications. Malfunctioning AI leads to additional work (if results have to be corrected), unintentional fraud (if people submit results prefilled by AI), incorrect accusations of fraud (if AI is used to check results) or even accidents and deaths (if AI is used for autonomous vehicles). The fact that these AI risks are larger for some groups than for others is unfair: it disadvantages many individuals in supposedly fair situations. The fact that only some people experience the bias while others do not adds insult to injury: there is a high probability that complaints from the victims will not be taken seriously when handled by people who are not aware of algorithmic bias.

Why are algorithms biased?

The main reason that algorithmic bias exists is that AI solutions consist of algorithms that are trained and evaluated on data sets. These data sets are often not diverse enough, for practical reasons: they may for instance have been assembled by computer science or data science students, come from biased sources, or have been generated using biased algorithms applied to biased sources. Using non-diverse data sets leads to a false sense of accuracy: the algorithms work well on these data sets, but fail in the real world.
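One practical way to catch this early is to audit how subgroups are represented in the training data before trusting any accuracy numbers measured on it. A minimal sketch (the sample records and field names are assumptions, standing in for your own data set's metadata):

```python
# Hypothetical audit: count how each subgroup is represented in the
# training data. A group near 0% means accuracy for that group is
# essentially untested.
from collections import Counter

samples = [
    {"skin_tone": "lighter", "gender": "male"},
    {"skin_tone": "lighter", "gender": "male"},
    {"skin_tone": "lighter", "gender": "female"},
    {"skin_tone": "darker",  "gender": "male"},
]

counts = Counter((s["skin_tone"], s["gender"]) for s in samples)
for group, n in counts.items():
    print(group, f"{n / len(samples):.0%}")
```

In this toy set, darker-skinned women do not appear at all, so any accuracy figure computed on it says nothing about how the algorithm will treat them in the real world.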

How to address algorithmic bias in software projects

If you are using AI in a software project, you must make sure human diversity is taken into account. The following steps will help you prevent algorithmic bias:

  • Carefully phrase detailed quality criteria. Do not just measure overall accuracy; measure and try to understand accuracy on subgroups to make sure the algorithm works for everyone.
  • Make sure training sets are diverse and unbiased. Using the right training set is more important than the right algorithm. If no diverse training sets exist, consider making and publishing a diverse data set.
  • Test the resulting system on a diverse group of subjects, both in the lab and in the field. Field results are often not as good as results in lab settings, since lab conditions (noise, lighting) are often optimized for the task at hand.
  • Do not implement automated decision making without human oversight. This is not just sound advice, it is a legal requirement under EU privacy law (see the rights of data subjects under the GDPR). AI is not perfect, and one cannot build systems assuming that it is.
  • Provide transparency on algorithms and test data. People have a right to know how decisions are made and how AI is used. This is true both for the subjects who are being classified and for the people using the system’s output, who need to understand its bias.
  • Ask consent for the use of AI and provide a mechanism for questions and complaints. Again, this is a requirement for processing personal data under the GDPR: you need to address the rights of data subjects.
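The first step above, detailed quality criteria, can be turned into an automated check. A sketch of one way to do this (function names, record format and the 90% threshold are all assumptions, not a standard): fail the evaluation if any subgroup's accuracy falls below a floor, instead of gating on overall accuracy alone.

```python
# Sketch of a per-subgroup fairness gate (illustrative names and thresholds).

def subgroup_accuracies(records):
    """records: iterable of (subgroup, predicted, actual) tuples."""
    stats = {}
    for group, predicted, actual in records:
        correct, total = stats.get(group, (0, 0))
        stats[group] = (correct + (predicted == actual), total + 1)
    return {g: c / t for g, (c, t) in stats.items()}

def check_fairness(records, floor=0.90):
    """Return all subgroup accuracies plus those below the floor."""
    accs = subgroup_accuracies(records)
    failing = {g: a for g, a in accs.items() if a < floor}
    return accs, failing

records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1),
    ("group_b", 1, 0), ("group_b", 0, 0), ("group_b", 1, 1),
]
accs, failing = check_fairness(records)
print(accs)     # group_a scores 1.0, group_b roughly 0.67
print(failing)  # group_b falls below the 90% floor
```

Wiring a check like this into the test suite makes the quality criterion explicit and prevents a release in which one group is silently underserved.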

More information

More information on the topic of algorithmic bias can be found at the Gender Shades website, including video material and the full paper. For more information on managing software projects, check our full series of blog posts. This series was created for the course Software Project Management at the Free University of Amsterdam. The full series in suggested reading order is:

Images: from gendershades.org

Author: Sieuwert van Otterloo
Dr. Sieuwert van Otterloo is a court-certified IT expert with interests in agile, security, software research and IT contracts.