AWS Certified Machine Learning – Specialty — Question 351
A company needs to develop a machine learning (ML) model for risk analysis. Before selecting features, an ML engineer needs to evaluate how much each feature of the training dataset contributes to the prediction of the target variable.
How should the ML engineer evaluate the contribution of each feature?
Answer options
- A. Use the Amazon SageMaker Data Wrangler multicollinearity measurement features and the principal component analysis (PCA) algorithm to calculate the variance of the dataset along multiple directions in the feature space.
- B. Use an Amazon SageMaker Data Wrangler quick model visualization to find feature importance scores that are between 0.5 and 1.
- C. Use the Amazon SageMaker Data Wrangler bias report to identify potential biases in the data related to feature engineering.
- D. Use an Amazon SageMaker Data Wrangler data flow to create and modify a data preparation pipeline. Manually add the feature scores.
Correct answer: B
Explanation
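Amazon SageMaker Data Wrangler's quick model visualization trains a quick model on the dataset and reports a feature importance score for each feature against the chosen target. Scores range from 0 to 1, and a higher score indicates a larger contribution to predicting the target variable, so features that score between 0.5 and 1 are the strongest candidates for selection. The multicollinearity measurements and PCA (Option A) describe relationships among the features themselves, not each feature's contribution to the target. The bias report (Option C) surfaces potential bias in the data rather than feature importance. A data flow with manually added scores (Option D) prepares data but does not compute any importance from it.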
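
The idea behind the quick model can be sketched outside Data Wrangler. The example below is only an approximation under stated assumptions: it uses scikit-learn (not a Data Wrangler API), a random forest with impurity-based (Gini) importance as the scoring method, and a synthetic dataset in which only the first two features are informative. The helper name `feature_importance_scores` and the feature names are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def feature_importance_scores(X, y, feature_names, random_state=0):
    """Fit a random forest and return {feature: importance}, highest first.

    Assumption: impurity-based importance from scikit-learn stands in for
    the scores that the quick model visualization displays.
    """
    model = RandomForestClassifier(n_estimators=100, random_state=random_state)
    model.fit(X, y)
    scores = dict(zip(feature_names, model.feature_importances_))
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


# Synthetic data: with shuffle=False, the first n_informative columns
# carry the signal and the remaining columns are pure noise.
X, y = make_classification(
    n_samples=500,
    n_features=5,
    n_informative=2,
    n_redundant=0,
    shuffle=False,
    random_state=0,
)
names = [f"feature_{i}" for i in range(5)]
scores = feature_importance_scores(X, y, names)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Each score lies between 0 and 1 and, in scikit-learn, the scores sum to 1 across features, so a single feature scoring above 0.5 accounts for most of the model's splitting power on its own.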