03. How It Works
Sources of bias
Historical bias in training data:
If the training data reflects past human discrimination, the model learns to replicate it. The COMPAS recidivism algorithm was trained on criminal justice data in which Black defendants were arrested and incarcerated at higher rates because of structural inequities in policing, not because of higher actual recidivism. The model reproduced this disparity.
Unrepresentative data:
MIT researcher Joy Buolamwini found that commercial facial analysis systems, trained on datasets that were more than 75% male and more than 80% white, misclassified darker-skinned women at error rates of up to roughly 35%, compared to under 1% for lighter-skinned men. The training sets simply contained far more examples of one group, so the systems had much less to learn from for everyone else.
Proxy variables:
An algorithm may not receive race as an input and still produce racially discriminatory outcomes. Zip code is a well-documented proxy for race in the US because of residential segregation. Amazon's same-day delivery service initially excluded predominantly Black neighborhoods while optimizing on zip-code-level profitability. Barocas and Selbst describe such variables as "mere stand-ins for protected groups."
Label bias:
If the ground-truth labels used in training were themselves the product of biased human judgment, the model learns to replicate that judgment. A hiring model trained on which candidates were hired in the past learns the hiring manager's preferences, not an objective measure of job fitness.
Feedback loops:
A biased prediction shapes the world it predicts. Predictive policing tools send more police to areas they flag as high-risk, which produces more arrests in those areas, which confirms the algorithm's prediction in the next training cycle. The bias amplifies over time.
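The loop can be made concrete with a small deterministic simulation. This is only a sketch under stated assumptions, not a model of any real predictive policing product: the two districts, the starting arrest counts, the 70/30 patrol split, and the function name simulate_patrol_feedback are all hypothetical. Both districts are given the same true offence rate, so any growth in the recorded gap comes from where patrols are sent.

```python
def simulate_patrol_feedback(rounds=10, initial_records=(55.0, 45.0),
                             true_rate=0.1, patrols=100, hot_spot_share=0.7):
    """Return district 0's share of all recorded arrests after each round.

    Hypothetical setup: both districts have the SAME true offence rate.
    The only asymmetry is a small gap in the historical arrest records.
    """
    records = list(initial_records)
    history = []
    for _ in range(rounds):
        # The "model" flags whichever district has more recorded arrests.
        hot = 0 if records[0] >= records[1] else 1
        allocation = [0.0, 0.0]
        allocation[hot] = patrols * hot_spot_share
        allocation[1 - hot] = patrols * (1 - hot_spot_share)
        # Recorded arrests scale with patrol presence, not with any
        # real difference in offending between the districts.
        new_arrests = [a * true_rate for a in allocation]
        # New arrests are added to the records used in the next round.
        records = [r + n for r, n in zip(records, new_arrests)]
        history.append(records[0] / sum(records))
    return history


history = simulate_patrol_feedback()
for round_number, share in enumerate(history, start=1):
    print(f"round {round_number}: district 0 holds {share:.3f} of recorded arrests")
```

Under these assumptions, district 0 starts with 55% of recorded arrests and its share climbs every round toward the 70% patrol share, even though the underlying offence rates are identical.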
Fairness definitions and their conflicts
Demographic parity (equal selection rates across groups): The system selects applicants, defendants, or patients at the same rate regardless of group. This criterion ignores any genuine differences in base rates between groups.
Equalized odds (equal true positive and false positive rates across groups): The system makes each kind of error at the same rate for every group, so no group bears more false alarms or more missed cases than another. ProPublica used this framing to criticize COMPAS, showing Black defendants were more likely to be falsely flagged as high risk.
Calibration (equal predictive accuracy at a given score across groups): A risk score of 70 means the same probability of reoffending regardless of race. Northpointe used this framing to defend COMPAS.
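The three definitions can be compared side by side on a confusion matrix per group. The sketch below uses made-up toy counts, not COMPAS data, and the helper names expand and group_rates are illustrative. It also assumes a single yes/no decision threshold, so positive predictive value (PPV) stands in as a coarse proxy for calibration rather than a full score-by-score check.

```python
from collections import defaultdict


def expand(group, tp, fp, tn, fn):
    """Build (group, y_true, y_pred) rows from confusion-matrix counts."""
    return ([(group, 1, 1)] * tp + [(group, 0, 1)] * fp
            + [(group, 0, 0)] * tn + [(group, 1, 0)] * fn)


def group_rates(records):
    """Compute per-group selection rate, TPR, FPR and PPV from 0/1 labels."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, y_true, y_pred in records:
        key = ("t" if y_true == y_pred else "f") + ("p" if y_pred == 1 else "n")
        counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        total = sum(c.values())
        positives = c["tp"] + c["fn"]   # actually positive
        negatives = c["fp"] + c["tn"]   # actually negative
        flagged = c["tp"] + c["fp"]     # predicted positive
        rates[group] = {
            "selection_rate": flagged / total,                        # demographic parity
            "tpr": c["tp"] / positives if positives else float("nan"),  # equalized odds
            "fpr": c["fp"] / negatives if negatives else float("nan"),  # equalized odds
            "ppv": c["tp"] / flagged if flagged else float("nan"),      # calibration proxy
        }
    return rates


# Toy data: group A has a base rate of 0.5, group B a base rate of 0.3.
records = expand("A", tp=4, fp=2, tn=3, fn=1) + expand("B", tp=2, fp=1, tn=6, fn=1)
for group, metrics in sorted(group_rates(records).items()):
    print(group, {name: round(value, 3) for name, value in metrics.items()})
```

In this toy example both groups have the same PPV (two thirds of flagged people are true positives), yet group A is selected twice as often and has a much higher false positive rate. The same classifier can look fair under one definition and unfair under the other two.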
The formal impossibility result states that calibration and equalized odds cannot both be satisfied when base rates differ between groups, except in the unrealistic case of a perfect predictor. Any choice between them is therefore a value judgment, not a purely technical decision.
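One way to see why the conflict is arithmetic rather than a flaw in any particular model is an identity that follows directly from the definitions of the rates and that appears in Chouldechova's analysis of COMPAS: FPR = p / (1 - p) * (1 - PPV) / PPV * TPR, where p is the group's base rate. The sketch below simply evaluates that identity. The base rates and the PPV and TPR values are illustrative assumptions, and equal PPV is used as the single-threshold counterpart of calibration.

```python
def implied_fpr(base_rate, ppv, tpr):
    """False positive rate forced by a group's base rate, PPV and TPR.

    Derivation: PPV = TP / (TP + FP) gives FP = TP * (1 - PPV) / PPV.
    Dividing by the number of actual negatives yields the identity.
    """
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * tpr


# Illustrative assumption: both groups get the SAME PPV and the SAME TPR,
# but the outcome is more common in group A than in group B.
shared_ppv, shared_tpr = 0.6, 0.7
fpr_a = implied_fpr(base_rate=0.5, ppv=shared_ppv, tpr=shared_tpr)
fpr_b = implied_fpr(base_rate=0.3, ppv=shared_ppv, tpr=shared_tpr)

print(f"group A false positive rate: {fpr_a:.3f}")
print(f"group B false positive rate: {fpr_b:.3f}")
```

With PPV and TPR held equal, the group with the higher base rate necessarily ends up with the higher false positive rate. Forcing the false positive rates to match would require giving up equal PPV or equal TPR, which is the trade-off the impossibility result describes.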