Google Cloud Professional Machine Learning Engineer — Question 59
While conducting an exploratory analysis of a dataset, you discover that categorical feature A has substantial predictive power, but it is sometimes missing. What should you do?
Answer options
- A. Drop feature A if more than 15% of values are missing. Otherwise, use feature A as-is.
- B. Compute the mode of feature A and then use it to replace the missing values in feature A.
- C. Replace the missing values with the values of the feature with the highest Pearson correlation with feature A.
- D. Add an additional class to categorical feature A for missing values. Create a new binary feature that indicates whether feature A is missing.
Correct answer: D
Explanation
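The correct answer is D. Adding an extra class for missing values keeps every row and preserves the predictive signal in feature A, while the binary indicator records which entries were originally missing, so the model can learn from the missingness itself. Option A risks discarding a highly predictive feature based on an arbitrary 15% threshold, and using the feature as-is below that threshold leaves the missing values unhandled. Option B folds the missing rows into the most common class, which hides the missingness and can introduce bias when values are not missing at random. Option C is unsound: Pearson correlation measures linear association between numeric variables, so it is not meaningful for a nominal categorical feature, and another feature's values are not valid categories of feature A.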
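As an illustration of option D, here is a minimal sketch using pandas. The toy data, the column names (feature_a, feature_a_missing, label), and the "MISSING" category label are all assumptions made for this example; they are not part of the question.

```python
import pandas as pd

# Hypothetical toy data: feature_a is categorical and sometimes missing.
df = pd.DataFrame({
    "feature_a": ["red", None, "blue", "red", None],
    "label": [1, 0, 1, 1, 0],
})

# Binary indicator: 1 where feature_a was originally missing, else 0.
# This must be computed before the missing values are filled in.
df["feature_a_missing"] = df["feature_a"].isna().astype(int)

# Additional class that stands in for the missing values.
# "MISSING" is an arbitrary label chosen for this sketch.
df["feature_a"] = df["feature_a"].fillna("MISSING")

# One-hot encode so that "MISSING" becomes a category like any other.
encoded = pd.get_dummies(df, columns=["feature_a"], dtype=int)
print(encoded)
```

With a plain one-hot encoding, the indicator column and the one-hot column for the extra class carry the same information. The separate indicator is more useful when feature A is encoded another way, for example with an embedding or target encoding.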