AWS Certified Machine Learning – Specialty — Question 295

A data scientist is building a linear regression model. The scientist inspects the dataset and notices that the mode of the distribution is lower than the median, and the median is lower than the mean.

Which data transformation will give the data scientist the ability to apply a linear regression model?

Answer options

Correct answer: B

Explanation

The ordering mode < median < mean indicates that the data is positively (right) skewed: a long right tail pulls the mean above the median. A logarithmic transformation is the standard approach to reducing right skewness for strictly positive data, because it compresses large values far more than small ones. This brings the distribution closer to symmetric and better suits the assumptions of linear regression. In contrast, an exponential transformation would exacerbate the right skew, while polynomial and sinusoidal transformations are not designed to resolve skewness.
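
The effect described above can be checked numerically. The following is a minimal sketch, not part of the original question: it assumes synthetic log-normal data as a stand-in for the right-skewed dataset, and `sample_skewness` is a hypothetical helper written for this illustration (the mean of cubed z-scores).

```python
import numpy as np


def sample_skewness(values):
    """Hypothetical helper: skewness as the mean of cubed z-scores."""
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    return float(np.mean(centered ** 3) / np.std(values) ** 3)


# Assumption: synthetic, strictly positive, right-skewed data (log-normal).
rng = np.random.default_rng(seed=0)
raw = rng.lognormal(mean=0.0, sigma=0.5, size=10_000)

# Right skew shows up as median < mean.
raw_median = float(np.median(raw))
raw_mean = float(raw.mean())

# Candidate transformations from the explanation.
logged = np.log(raw)          # compresses the long right tail
exponentiated = np.exp(raw)   # stretches the right tail further

skew_raw = sample_skewness(raw)
skew_log = sample_skewness(logged)
skew_exp = sample_skewness(exponentiated)

print(f"median={raw_median:.3f}, mean={raw_mean:.3f}")
print(f"skewness raw={skew_raw:.3f}, log={skew_log:.3f}, exp={skew_exp:.3f}")
```

On data like this, the skewness after the log transform should sit near zero, while the exponential transform should push it well above the raw value. Note that `np.log` requires strictly positive inputs; for data containing zeros, `np.log1p` is a common alternative.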