binning

For example, consider a feature named X whose lowest value is 15 and highest value is 425. Using binning, you could represent X with the following five bins:

Binning is a good alternative to [[Normalization|scaling]] or [[Clipping]] when either of the following conditions is met:

Binning can seem counterintuitive, since the model in the previous example treats the values 37 and 115 identically. But when a feature's values look more clumpy than linear, binning often represents the data better than the raw number does.
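As a minimal sketch of how this plays out, the snippet below maps values of X to bin indices and then to one-hot vectors. The bin boundaries (`INTERIOR_EDGES`) are hypothetical, chosen only so that 37 and 115 land in the same bin. The helper names `bin_index` and `one_hot` and the one-hot encoding are also assumptions made for illustration, not something the text prescribes.

```python
import numpy as np

# Hypothetical interior boundaries for a feature X ranging from 15 to 425.
# They split the range into five bins:
#   bin 0: X < 35, bin 1: 35 <= X < 118, bin 2: 118 <= X < 280,
#   bin 3: 280 <= X < 393, bin 4: X >= 393
INTERIOR_EDGES = np.array([35, 118, 280, 393])


def bin_index(x):
    """Return the 0-based bin index of each value in x."""
    return np.digitize(x, INTERIOR_EDGES)


def one_hot(indices, num_bins=5):
    """Encode bin indices as one-hot rows, one column per bin."""
    return np.eye(num_bins, dtype=np.float32)[indices]


values = np.array([15, 37, 115, 300, 425])
idx = bin_index(values)
features = one_hot(idx)

# 37 and 115 fall in the same bin, so the model receives identical
# feature vectors for them.
same_representation = np.array_equal(features[1], features[2])
```

With this encoding the model learns a separate weight per bin instead of a single weight for the raw value, which is why values inside one bin become indistinguishable.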

Quantile bucketing is a variant of binning that chooses bucket boundaries so that each bucket holds exactly or nearly the same number of examples.
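One way to sketch quantile bucketing is to take evenly spaced quantiles of the feature as the boundaries. The function name `quantile_boundaries`, the synthetic log-normal data, and the choice of four buckets are all assumptions made for illustration.

```python
import numpy as np


def quantile_boundaries(values, num_buckets):
    """Return the num_buckets - 1 interior boundaries at evenly spaced quantiles."""
    # Drop the 0th and 100th percentiles; only interior cut points are needed.
    quantiles = np.linspace(0.0, 1.0, num_buckets + 1)[1:-1]
    return np.quantile(values, quantiles)


# Synthetic, skewed data standing in for a real feature (assumption).
rng = np.random.default_rng(0)
values = rng.lognormal(mean=3.0, sigma=1.0, size=1000)

boundaries = quantile_boundaries(values, num_buckets=4)
bucket_ids = np.digitize(values, boundaries)

# Number of examples per bucket; these should be equal or nearly equal.
counts = np.bincount(bucket_ids, minlength=4)
```

Because the data is skewed, the resulting buckets have very different widths, but each holds roughly a quarter of the examples. Heavily repeated values can make the counts less even, since ties cannot be split across a boundary.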