prediction-bias

As mentioned in the [[Linear Regression]] module, calculating prediction bias is a quick check that can flag issues with the model or training data early on.

Prediction bias is the difference between the mean of a model’s predictions and the mean of ground-truth labels in the data.
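
A minimal sketch of the definition above, in plain Python. The function name `prediction_bias`, the sign convention (predictions minus labels), and the spam data are illustrative assumptions, not taken from the module.

```python
from statistics import fmean


def prediction_bias(predictions, labels):
    """Return mean(predictions) - mean(labels).

    The sign convention is an assumption: a positive value means the
    model predicts too high on average, a negative value too low.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(predictions) == 0:
        raise ValueError("need at least one example")
    return fmean(predictions) - fmean(labels)


# Hypothetical data: 100 emails, 5 of which are spam (label 1).
labels = [1] * 5 + [0] * 95

# A model whose average predicted spam probability matches the 5% base rate.
matching_mean = [0.05] * 100

# A model that predicts 50% spam probability for every email.
over_predicting = [0.50] * 100

bias_ok = prediction_bias(matching_mean, labels)     # close to 0
bias_bad = prediction_bias(over_predicting, labels)  # close to 0.45
```

A bias near zero only says the two means agree; it does not show that individual predictions are accurate. A clearly nonzero value, like the second case, is the signal to look into the causes listed below.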

Prediction bias can be caused by:

  • Biases or [[Noise]] in the data, including biased sampling for the training set
  • Overly strong [[Data Science/Regularization]], which oversimplifies the model so that it loses necessary complexity
  • Bugs in the model training pipeline
  • The set of features provided to the model being insufficient for the task