AWS Certified Machine Learning – Specialty — Question 275

A data engineer is evaluating customer data in Amazon SageMaker Data Wrangler. The data engineer will use the customer data to create a new model to predict customer behavior.

The engineer needs to increase the model performance by checking for multicollinearity in the dataset.

Which steps can the data engineer take to accomplish this with the LEAST operational effort? (Choose two.)

Answer options

Correct answer: B, E

Explanation

SageMaker Data Wrangler's diagnostic visualization can calculate singular values through PCA/SVD and plot the coefficient values of a LASSO model trained on the dataset. Both are built-in ways to detect multicollinearity with minimal effort: singular values close to zero indicate that some features are nearly linear combinations of others, and LASSO shrinks the coefficients of redundant features toward zero. One-hot encoding and Min Max scaling are feature engineering transformations, not diagnostic tools for multicollinearity. The Quick Model visualization estimates feature importance and model viability; it does not specifically diagnose multicollinearity.
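
To make the two diagnostics concrete, here is a minimal NumPy sketch of the underlying ideas. It is not Data Wrangler's implementation: the synthetic dataset, the helper names (`standardize`, `singular_values`, `lasso_coordinate_descent`), and the penalty `alpha=0.1` are all assumptions chosen for illustration. The column `x3` is built as a near copy of `x1` so that the dataset is deliberately multicollinear.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit variance (population std)."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def singular_values(X):
    """Singular values of the standardized feature matrix, largest first."""
    return np.linalg.svd(standardize(X), compute_uv=False)

def lasso_coordinate_descent(X, y, alpha=0.1, n_iter=200):
    """Illustrative LASSO solver (hypothetical helper, not a library API).

    Minimizes (1 / 2n) * ||y - Xw||^2 + alpha * ||w||_1 on standardized
    features and a centered target, using cyclic coordinate descent.
    """
    Xs = standardize(X)
    yc = y - y.mean()
    n, p = Xs.shape
    w = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Residual with feature j's current contribution added back.
            partial_residual = yc - Xs @ w + Xs[:, j] * w[j]
            rho = Xs[:, j] @ partial_residual / n
            # Soft-thresholding; each standardized column has squared norm n.
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0)
    return w

# Synthetic data (assumed for illustration): x3 is almost a duplicate of x1.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.01 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 3.0 * x1 + 2.0 * x2 + rng.normal(scale=0.1, size=n)

sv = singular_values(X)
coefs = lasso_coordinate_descent(X, y)

print("singular values:", sv)
print("smallest / largest:", sv[-1] / sv[0])
print("LASSO coefficients (x1, x2, x3):", coefs)
```

In this sketch the smallest singular value is tiny relative to the largest, which is the signal that one column is nearly a linear combination of the others. The LASSO coefficients for `x1` and `x3` together carry roughly the effect of a single feature; how the weight is split between the two depends on the data and the solver, so a coefficient plot should be read as a hint about redundancy rather than a definitive ranking.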