Intro to Bias in AI

To understand AI bias, we need to understand dataset bias. Collecting, labelling, and organizing data is time-consuming and expensive. Many popular datasets in the artificial intelligence community take years to produce and publish, and the effort requires a large amount of resources. Since it’s impractical to create a dataset that covers every possible permutation and domain, every dataset has some form of bias in it. This limitation in the data causes lower performance and poorer generalization on underrepresented domains.

The simple answer is to create more data, but that’s not easy. A better solution is to improve existing machine learning models with techniques such as domain adaptation. But before we dive into solutions, let’s review the problem itself.

I’ll go over a few terms that come up often, but there are many more.

Dataset Bias

Dataset bias corresponds to properties that appear disproportionately often in a dataset. For instance, in the COCO dataset, “person” is the most frequent object category across images, so the bias in the COCO dataset is “person”. Biases can make it easy for a human to distinguish between datasets, but they often result in decreased model performance (due to overfitting), which hinders learning reliable features.

The COCO dataset has 66,000 labels for “person” but only 2,000 labels for “elephant”. The dataset bias is “person”.
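As a toy illustration, this kind of label imbalance can be surfaced with a quick frequency count. The annotation list below is hypothetical, using the rounded COCO counts from above:

```python
from collections import Counter

# Hypothetical flat list of object labels, one entry per annotation,
# using the rounded counts mentioned above.
annotations = ["person"] * 66_000 + ["elephant"] * 2_000

counts = Counter(annotations)
dominant_label, dominant_count = counts.most_common(1)[0]
print(dominant_label, dominant_count)  # person 66000
```

In practice you would run a count like this over a real annotation file before training, to know which categories the model will see (and overfit to) most.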

Domain Shift

Domain shift refers to the situation where training data and test data have different domains. For instance, domain shift happens when daytime images are used for training data and nighttime images are used for test data. The bias in the training data is “daytime”. This domain shift leads to lower performance.

Train the model on daytime data (left). Test the model on nighttime data (right).

Suppose you have a neural network that takes images and predicts object bounding boxes and labels. You could deploy it on a self-driving car to identify pedestrians. Let’s assume that the model was trained on a dataset from sunny California.

The detection system can detect pedestrians from California (left), but not from Boston (right).

Let’s say that you deploy the trained network in Boston during the winter. Since the network hasn’t seen this type of data during training, it’s very likely to miss quite a few objects, including pedestrians in this example. This is the problem of domain shift, which is caused by dataset bias. The training data comes from the source domain, and the deployment data comes from the target domain. The distribution over the input data has shifted from training time to test time.

Domain shift happens in applications when the models are deployed in the real world.


In general, domain shift can happen whenever a model is trained on one dataset and applied to a different one. For example, a dataset of objects can come from product images with white backgrounds and canonical poses, while the model is deployed on a dataset collected by a mobile robot, where the backgrounds are cluttered and the viewpoints and lighting conditions are very different.

Product images with white backgrounds (left). Robot images with cluttered backgrounds (right).
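To make the effect concrete, here is a minimal synthetic sketch (feature vectors, not real images): a nearest-centroid classifier is fit on “source” features, and then evaluated on “target” features whose distribution has shifted by a global offset. The domains and numbers are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Source domain ("daytime"): two classes separated along feature 0.
X_src = np.vstack([rng.normal([0, 0], 1, (200, 2)),
                   rng.normal([4, 0], 1, (200, 2))])
y_src = np.array([0] * 200 + [1] * 200)

# Target domain ("nighttime"): same classes, but every input is
# shifted, e.g., by a global intensity offset.
X_tgt = X_src + np.array([10.0, 0.0])
y_tgt = y_src

# "Training": fit a nearest-centroid classifier on the source domain.
centroids = np.stack([X_src[y_src == c].mean(axis=0) for c in (0, 1)])

def predict(X):
    # Assign each point to the class of its nearest centroid.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

acc_src = (predict(X_src) == y_src).mean()
acc_tgt = (predict(X_tgt) == y_tgt).mean()
print(f"source accuracy: {acc_src:.2f}, target accuracy: {acc_tgt:.2f}")
```

On the source domain the classifier is nearly perfect; on the shifted target domain it collapses to chance, even though the underlying classes never changed. That is domain shift in miniature.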

Skin Tones

Here is a very important problem: a face detection system is trained on data biased toward lighter-skinned faces. This results in poor predictive performance on images of darker-skinned faces. We’d like to improve the performance for people with darker skin tones.

Face detection works on lighter-skinned people, but less well on darker-skinned people. [Credit: Joy Buolamwini]


Another example of domain shift is when the training data is biased toward a particular modality. For example, the training data consists of RGB images, while the test data consists of depth images. Using domain adaptation, we can improve detection performance on the depth images.

This uses domain adaptation to improve performance from RGB images (left) to Depth images (right).


Another very common example is training robots in simulation.

Training an object manipulator in simulation and deploying it in the real setting.

In this example, a robot arm is picking up an object. Ideally, we want to train these policies in simulation, because it’s a cheaper source of data that doesn’t risk damaging the robot. However, at test time, the robot sees real images. Therefore, we’d like to be able to adapt from simulation to reality. (For more information on Sim2Real, check out my Overview on Sim2Real. Coming Soon.)

Dataset bias in the training data causes poor performance and poor generalization on future test data. The mismatch between the kinds of images seen at train and test time can render a model useless during evaluation, causing a significant drop in performance and making our models inaccurate.

If you train on MNIST data, but test on a different domain (e.g., USPS, SVHN), then you have a significant drop in performance.

For example, if we train on MNIST and test on images from MNIST, we get around 99% accuracy. However, if we train on the MNIST domain and test on Street View House Numbers (SVHN), performance drops to around 67.1%. This is much lower than it should be, so this is a serious issue. In fact, even between the two similar-looking domains of MNIST and USPS, there is still a very significant drop in performance.

Solutions for AI bias exist today, and they improve model performance. We’ll get a high-level sense of some of them here.

Domain Adaptation

Domain adaptation is one solution for domain shift. It’s a method for adapting a model to a novel dataset with no or few labels from that dataset. We assume access to a small set of target-domain images during training, which gives the model a sense of what the target distribution looks like.
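One classic baseline in this setting is correlation alignment (CORAL), which matches the second-order statistics of the source features to the target features using only unlabeled target data. Here is a minimal numpy sketch; the feature data is synthetic and the shapes are arbitrary:

```python
import numpy as np

def coral(X_src, X_tgt, eps=1e-5):
    """Align source features to the target distribution (CORAL).

    Whitens the source features with the source covariance, then
    re-colors them with the target covariance. Only *unlabeled*
    target data is needed, matching the domain adaptation setting.
    """
    cov_s = np.cov(X_src, rowvar=False) + eps * np.eye(X_src.shape[1])
    cov_t = np.cov(X_tgt, rowvar=False) + eps * np.eye(X_tgt.shape[1])

    def sqrtm(C, inv=False):
        # Matrix (inverse) square root via eigendecomposition;
        # covariance matrices are symmetric positive semi-definite.
        vals, vecs = np.linalg.eigh(C)
        power = -0.5 if inv else 0.5
        return (vecs * vals**power) @ vecs.T

    return X_src @ sqrtm(cov_s, inv=True) @ sqrtm(cov_t)

rng = np.random.default_rng(0)
X_src = rng.normal(0, 1, (500, 3))                 # synthetic source features
X_tgt = rng.normal(0, 3, (500, 3)) * [1, 2, 0.5]   # target: different scales

X_adapted = coral(X_src, X_tgt)
```

After alignment, the covariance of `X_adapted` matches the target covariance, so a classifier trained on `X_adapted` with the source labels tends to transfer better to the target domain.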

Domain Generalization

Domain Generalization is another solution for domain shift. Unlike domain adaptation, it assumes no target data at all during training. It’s a strictly zero-shot scenario that relies heavily on the model’s ability to generalize.

Latent Domain Discovery

Latent Domain Discovery helps with Domain Generalization. Some aspects of both dataset bias and domain shift can be easy for humans to infer by observation. But there can be some aspects which are inherently latent and may not be immediately obvious. These latent domains exist but are not labeled in datasets. For instance, images found on the web can be thought of as a collection of many hidden domains. Discovering latent domains can significantly improve generalization and performance.

Image results for “person” consist of latent domains (e.g., “groups”, “silhouettes”, “line drawings”, “close-ups”).
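One simple way to discover latent domains is to cluster feature vectors and treat each cluster as a domain. The sketch below uses a minimal k-means on synthetic 2-D “features”, where the three groups stand in for latent domains like “close-ups” versus “silhouettes”; real systems would cluster learned image embeddings instead.

```python
import numpy as np

def discover_latent_domains(X, k, iters=50, seed=0):
    """Cluster feature vectors with a minimal k-means and treat each
    resulting cluster as one latent domain."""
    rng = np.random.default_rng(seed)
    # Farthest-point initialization: spread the initial centers out.
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        dists = np.min([np.linalg.norm(X - c, axis=1) for c in centers],
                       axis=0)
        centers.append(X[dists.argmax()])
    centers = np.array(centers)

    for _ in range(iters):
        # Assign each point to its nearest center, then update centers.
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = X[labels == c].mean(axis=0)
    return labels

# Synthetic 2-D "features" drawn from three hypothetical latent domains.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 0.3, (100, 2))
               for m in ([0, 0], [5, 0], [0, 5])])
domains = discover_latent_domains(X, k=3)
```

Once each image has a discovered domain label, downstream methods can condition on it, for example by training domain-specific components or balancing the domains during training.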

Other solutions not mentioned here include Transfer Learning and Representation Learning. We will cover more solutions in further depth in future articles. In the meantime, feel free to reach out if you have any specific topics you’d like me to cover. Also, if you’re new to AI, this brief Intro to AI should help.

Read More

Luis Bermudez