Databricks Certified Machine Learning Associate — Question 37

In which of the following situations is it preferable to impute missing feature values with their median value over the mean value?

Answer options

Correct answer: C

Explanation

The median is less sensitive to extreme outliers than the mean, so it is the better choice when the feature contains many extreme outliers. A few very large or very small values can pull the mean far from the bulk of the data, and every imputed entry then inherits that distortion. The median stays close to a typical value. The other options do not describe conditions that favor the median over the mean.
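The effect can be illustrated with a minimal sketch using only the Python standard library. The `incomes` column, its values, and the `impute` helper are hypothetical, chosen only to show how a single extreme value shifts the mean but not the median.

```python
from statistics import mean, median

# Hypothetical feature column (illustrative values only); None marks a missing entry.
incomes = [42, 45, 47, 50, 52, None, 55, 58, None, 2500]

# Compute the fill statistics from the observed (non-missing) values only.
observed = [v for v in incomes if v is not None]
mean_fill = mean(observed)
median_fill = median(observed)


def impute(values, fill):
    """Return a copy of values with every None replaced by fill."""
    return [fill if v is None else v for v in values]


mean_imputed = impute(incomes, mean_fill)
median_imputed = impute(incomes, median_fill)

# The single outlier (2500) drags the mean well above every typical value,
# while the median remains in the middle of the bulk of the data.
print(f"mean fill:   {mean_fill}")
print(f"median fill: {median_fill}")
```

Here the mean-based fill lands far above every non-outlier observation, whereas the median-based fill sits between 50 and 52. In Spark ML, the same choice is typically made through the `strategy` parameter of `pyspark.ml.feature.Imputer`, which accepts `"mean"` or `"median"`.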