You are developing an ML model using a dataset with categorical input variables. You have…

Question

You are developing an ML model using a dataset with categorical input variables. You have randomly split half of the data into training and test sets. After applying one-hot encoding on the categorical variables in the training set, you discover that one categorical variable is missing from the test set. What should you do?

Accepted Answer

Correct answer: C. C. Apply one-hot encoding on the categorical variables in the test data — The correct action is to apply one-hot encoding on the test data because this ensures that the test set has the same categorical variable structure as the training set. Options A and D do not directly address the mismatch in encoding, and option B alters the data split without solving the encoding issue.

Google Cloud Professional Machine Learning Engineer — Question 333

Answer options

Correct answer: C

Explanation