Google Cloud Professional Machine Learning Engineer — Question 73

You are working on a classification problem with time series data. After conducting just a few experiments using random cross-validation, you achieved an Area Under the Receiver Operating Characteristic Curve (AUC ROC) value of 99% on the training data. You haven’t explored using any sophisticated algorithms or spent any time on hyperparameter tuning. What should your next step be to identify and fix the problem?

Answer options

Correct answer: B

Explanation

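The correct answer is B. An AUC ROC of 99% after only a few experiments, with no sophisticated algorithms and no hyperparameter tuning, is more likely a sign of data leakage than of a genuinely strong model. With time series data, random cross-validation can put later observations in the training folds while earlier ones are held out, so the model is scored using information it would not have at prediction time. Nested cross-validation helps identify and mitigate this leakage by keeping the data used for evaluation separate from the data used for training and model selection, so the training process does not inadvertently use information from the test set. Option A suggests using a simpler algorithm, which does not address leakage. Option C focuses on correlation but does not provide a thorough method for handling data leakage. Option D incorrectly treats reducing the AUC ROC score as a valid fix for overfitting; a lower score is not a goal in itself and leaves the leakage in place.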
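The fold structure behind this reasoning can be illustrated with a minimal, standard-library-only sketch. It assumes that the samples are already sorted by timestamp and that the nested scheme uses expanding-window (time-ordered) folds at both levels; the function names `expanding_window_splits`, `nested_time_splits`, and `leaks_future` are hypothetical and are not taken from the question or from any library.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]


def expanding_window_splits(n_samples: int, n_splits: int) -> Iterator[Split]:
    """Yield (train, test) index lists in which every test index comes
    after every train index. Assumes samples are sorted by timestamp."""
    if n_splits < 1 or n_samples < n_splits + 1:
        raise ValueError("need at least n_splits + 1 samples")
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = fold * k
        # The last fold absorbs any remainder so no sample is dropped.
        test_end = fold * (k + 1) if k < n_splits else n_samples
        yield list(range(train_end)), list(range(train_end, test_end))


def nested_time_splits(
    n_samples: int, outer_splits: int, inner_splits: int
) -> Iterator[Tuple[List[int], List[int], List[Split]]]:
    """For each outer fold, yield (outer_train, outer_test, inner_folds).
    Inner folds are drawn only from outer_train, so model selection
    never sees the outer test period."""
    for outer_train, outer_test in expanding_window_splits(n_samples, outer_splits):
        inner = [
            ([outer_train[i] for i in tr], [outer_train[i] for i in va])
            for tr, va in expanding_window_splits(len(outer_train), inner_splits)
        ]
        yield outer_train, outer_test, inner


def leaks_future(train: List[int], test: List[int]) -> bool:
    """Return True if any training index is at or after the earliest test index."""
    return max(train) >= min(test)


if __name__ == "__main__":
    for outer_train, outer_test, inner in nested_time_splits(12, 3, 2):
        print("outer train:", outer_train, "outer test:", outer_test)
        for tr, va in inner:
            print("   inner train:", tr, "inner validation:", va)
```

In this sketch, hyperparameters would be chosen on the inner folds and the final score reported on the outer test fold only. A random split such as train `[0, 2, 4]` and test `[1, 3]` makes `leaks_future` return `True`, which is the situation the question describes.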