AWS Certified Machine Learning – Specialty — Question 367
A data scientist uses Amazon SageMaker Data Wrangler to obtain a feature summary from a dataset that the data scientist imported from Amazon S3. The data scientist notices that the prediction power for a dataset feature has a score of 1.
What is the cause of the score?
Answer options
- A. Target leakage occurred in the imported dataset.
- B. The data scientist did not fine-tune the training and validation split.
- C. The SageMaker Data Wrangler algorithm that the data scientist used did not find an optimal model fit for each feature to calculate the prediction power.
- D. The data scientist did not process the features enough to accurately calculate prediction power.
Correct answer: A
Explanation
In a SageMaker Data Wrangler feature summary, prediction power measures how well a single feature predicts the target on its own, on a normalized scale from 0 (no useful information about the target) to 1 (perfect prediction). A score of 1 means the feature alone determines the target, which usually indicates target leakage: the dataset contains a column that is derived from the target or recorded after the outcome is known, so it would not be available at inference time. Options B, C, and D are incorrect because a poorly tuned training and validation split, a suboptimal model fit, or insufficient feature processing would tend to lower or distort the score, not raise it to a perfect 1.
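As a rough illustration of why a leaked column scores 1, the sketch below scores each feature on its own against the target. It is not Data Wrangler's actual algorithm: the toy data, the column names `leaky` and `noise`, the one-feature threshold rule, and the use of validation accuracy as a stand-in for prediction power are all assumptions made for this example.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical toy data: 200 rows with a balanced binary target.
targets = [i % 2 for i in range(200)]
random.shuffle(targets)

# "leaky" copies the target (e.g. a field filled in after the outcome is known).
# "noise" is drawn independently of the target.
leaky = [float(y) for y in targets]
noise = [random.gauss(0.0, 1.0) for _ in targets]


def predict(x, threshold, sign):
    """Threshold rule: predict 1 when x lies on the chosen side of threshold."""
    return 1 if sign * (x - threshold) >= 0 else 0


def single_feature_score(values, labels, train_fraction=0.8):
    """Fit a threshold rule on the training rows; return validation accuracy."""
    split = int(len(values) * train_fraction)
    train = list(zip(values[:split], labels[:split]))
    valid = list(zip(values[split:], labels[split:]))

    # Try every observed value as a threshold, in both directions,
    # and keep the rule with the most correct training predictions.
    best_correct, best_rule = -1, None
    for threshold in sorted({x for x, _ in train}):
        for sign in (1, -1):
            correct = sum(predict(x, threshold, sign) == y for x, y in train)
            if correct > best_correct:
                best_correct, best_rule = correct, (threshold, sign)

    threshold, sign = best_rule
    hits = sum(predict(x, threshold, sign) == y for x, y in valid)
    return hits / len(valid)


leaky_score = single_feature_score(leaky, targets)
noise_score = single_feature_score(noise, targets)
print(f"leaky feature: {leaky_score:.2f}")
print(f"noise feature: {noise_score:.2f}")
```

The leaky column reaches a perfect score because it is a copy of the target, while the unrelated column stays well below it. How the data is split or how much the features are processed does not produce that perfect score; the leaked information does.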