AWS Certified Machine Learning – Specialty — Question 285
A data scientist obtains a tabular dataset that contains 150 correlated features with different ranges to build a regression model. The data scientist needs to achieve more efficient model training by implementing a solution that minimizes impact on the model’s performance. The data scientist decides to perform a principal component analysis (PCA) preprocessing step to reduce the number of features to a smaller set of independent features before the data scientist uses the new features in the regression model.
Which preprocessing step will meet these requirements?
Answer options
- A. Use the Amazon SageMaker built-in algorithm for PCA on the dataset to transform the data.
- B. Load the data into Amazon SageMaker Data Wrangler. Scale the data with a Min Max Scaler transformation step. Use the SageMaker built-in algorithm for PCA on the scaled dataset to transform the data.
- C. Reduce the dimensionality of the dataset by removing the features that have the highest correlation. Load the data into Amazon SageMaker Data Wrangler. Perform a Standard Scaler transformation step to scale the data. Use the SageMaker built-in algorithm for PCA on the scaled dataset to transform the data.
- D. Reduce the dimensionality of the dataset by removing the features that have the lowest correlation. Load the data into Amazon SageMaker Data Wrangler. Perform a Min Max Scaler transformation step to scale the data. Use the SageMaker built-in algorithm for PCA on the scaled dataset to transform the data.
Correct answer: B
Explanation
PCA finds the directions of greatest variance, so it is highly sensitive to the relative scale of the input features. Features with larger ranges must be scaled first (for example, with a Min Max Scaler transformation in SageMaker Data Wrangler) so that they do not dominate the principal components. Applying PCA directly without scaling, as in option A, would produce components driven mainly by the features with the largest ranges rather than by the underlying structure of the data. Manually removing features before PCA, as in options C and D, is unnecessary and counterproductive: PCA already combines correlated features into a smaller set of uncorrelated components, and dropping features by hand discards information that could affect the model's performance.
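The effect of scaling can be illustrated with a small local experiment. The sketch below is not the SageMaker built-in PCA algorithm or a Data Wrangler flow; it uses plain NumPy as a stand-in, and the synthetic dataset, the helper names `min_max_scale` and `pca`, and the 95% variance threshold are all assumptions made for illustration. It builds 150 correlated features from 3 hidden factors, inflates the range of one feature, and compares the first principal component with and without min-max scaling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 3 latent factors drive 150 correlated features.
n_samples, n_features, n_latent = 500, 150, 3
latent = rng.normal(size=(n_samples, n_latent))
mixing = rng.normal(size=(n_latent, n_features))
X = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))
X[:, 0] *= 10_000  # one feature with a much larger range than the rest


def min_max_scale(data):
    """Rescale every column to the [0, 1] range."""
    col_min = data.min(axis=0)
    col_range = data.max(axis=0) - col_min
    return (data - col_min) / col_range


def pca(data):
    """Return (components, explained_variance_ratio) via SVD of centered data."""
    centered = data - data.mean(axis=0)
    _, singular_values, components = np.linalg.svd(centered, full_matrices=False)
    variance = singular_values ** 2
    return components, variance / variance.sum()


raw_components, raw_ratio = pca(X)
scaled_components, scaled_ratio = pca(min_max_scale(X))

# Weight of the large-range feature in the first principal component.
unscaled_loading = abs(raw_components[0, 0])
scaled_loading = abs(scaled_components[0, 0])

# Number of components needed to keep 95% of the variance after scaling.
n_kept = int(np.searchsorted(np.cumsum(scaled_ratio), 0.95) + 1)

print(f"loading on feature 0: raw={unscaled_loading:.3f}, scaled={scaled_loading:.3f}")
print(f"components for 95% of variance after scaling: {n_kept}")
```

Without scaling, the first component is essentially the single large-range feature. After scaling, the weight is spread across the correlated features and a handful of components capture most of the variance, which is the behavior option B relies on.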