linear-scaling | TNPSC Fuhrer Notes

Linear scaling (more commonly shortened to just scaling) means converting floating-point values from their natural range into a standard range—usually 0 to 1 or -1 to +1.

Linear scaling is a good choice when all of the following conditions are met:

The lower and upper bounds of your data don’t change much over time.
The feature contains few or no outliers, and those outliers aren’t extreme.
The feature is approximately uniformly distributed across its range. That is, a histogram would show roughly even bars for most values.

Suppose human age is a feature. Linear scaling is a good normalization technique for age because:

The approximate lower and upper bounds are 0 to 100.
age contains a relatively small percentage of outliers. Only about 0.3% of the population is over 100.
Although certain ages are somewhat better represented than others, a large dataset should contain sufficient examples of all ages.

Note :

Most real-world features do not meet all of the criteria for linear scaling.
[[Z-score]] scaling is typically a better normalization choice than linear scaling.