Algorithmic bias can originate from at least three sources:
- Data – the training data is skewed or imbalanced, so what the algorithm learns is skewed or imbalanced too. (This is about what the algorithm learns from.)
- Algorithm – the algorithm optimizes for central tendency, which leads to errors on edge cases. (This is about how the algorithm learns.)
- Human – a person sets suboptimal hyperparameters. (This is also about how the algorithm learns.)
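As a rough illustration of the second source, the sketch below fits the simplest possible "model", a single constant chosen to minimize mean squared error, which is just the mean of the targets. The numbers are made up for illustration and do not come from any real dataset; the point is only that a fit driven by the bulk of the data can look fine on typical cases while being far off on an edge case.

```python
# Hypothetical toy targets: nine typical values and one edge case.
targets = [10, 10, 10, 10, 10, 10, 10, 10, 10, 100]

# The constant that minimizes mean squared error is the arithmetic mean,
# so this "model" is pulled toward the typical values.
prediction = sum(targets) / len(targets)

# Absolute error of that single prediction on every target.
abs_errors = [abs(t - prediction) for t in targets]

# Average error on the nine typical cases versus the error on the edge case.
typical_error = sum(abs_errors[:-1]) / len(abs_errors[:-1])
edge_error = abs_errors[-1]

print(f"prediction={prediction}, typical error={typical_error}, edge error={edge_error}")
```

In this toy setup the error on the single edge case is nine times the error on a typical case, even though the fit is optimal by its own criterion.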
In the first case (data), the skew or imbalance can arise for several reasons:
- The world is biased – the data accurately reflects that bias.
- The world is sampled incorrectly – the data collection process introduces bias.
- A biased model generates biased data that is then used for further learning, creating a feedback loop.
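The second reason, biased sampling, can be sketched with a small invented population. Everything here is an assumption made for illustration: the group names `A` and `B`, their sizes, their positive rates, and the collection rule that keeps every record from one group but only every fifth record from the other.

```python
# Hypothetical population: two equally sized groups with different positive rates
# (group A: 40 of 50 positive, group B: 10 of 50 positive).
population = (
    [("A", 1)] * 40 + [("A", 0)] * 10 +
    [("B", 1)] * 10 + [("B", 0)] * 40
)


def positive_rate(records):
    """Fraction of records whose label is 1."""
    return sum(label for _, label in records) / len(records)


true_rate = positive_rate(population)

# Biased collection: keep every record from group A but only every fifth from B.
group_a = [r for r in population if r[0] == "A"]
group_b = [r for r in population if r[0] == "B"]
biased_sample = group_a + group_b[::5]

sampled_rate = positive_rate(biased_sample)

print(f"true rate={true_rate:.2f}, rate in biased sample={sampled_rate:.2f}")
```

Each group's own positive rate is preserved in the sample, yet the overall rate shifts because group A is over-represented. A model trained on such a sample would inherit that shift even though no individual record is wrong.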
