AWS Certified Machine Learning – Specialty — Question 310
A machine learning (ML) specialist needs to solve a binary classification problem for a marketing dataset. The ML specialist must maximize the Area Under the ROC Curve (AUC) of the algorithm by training an XGBoost algorithm. The ML specialist must find values for the eta, alpha, min_child_weight, and max_depth hyperparameters that will generate the most accurate model.
Which approach will meet these requirements with the LEAST operational overhead?
Answer options
- A. Use a bootstrap script to install scikit-learn on an Amazon EMR cluster. Deploy the EMR cluster. Apply k-fold cross-validation methods to the algorithm.
- B. Deploy Amazon SageMaker prebuilt Docker images that have scikit-learn installed. Apply k-fold cross-validation methods to the algorithm.
- C. Use Amazon SageMaker automatic model tuning (AMT). Specify a range of values for each hyperparameter.
- D. Subscribe to an AUC algorithm that is on AWS Marketplace. Specify a range of values for each hyperparameter.
Correct answer: C
Explanation
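Amazon SageMaker automatic model tuning (AMT) runs the hyperparameter search as a managed service. The specialist specifies a range of values for eta, alpha, min_child_weight, and max_depth, plus an objective metric to maximize (here, AUC on the validation data), and AMT launches the training jobs and reports the best-performing combination it found. Options A and B both require the specialist to set up the environment (an EMR cluster with a bootstrap script, or scikit-learn containers) and to write and maintain the k-fold cross-validation search code, which significantly increases operational overhead. Option D is unnecessary: AUC is an evaluation metric rather than an algorithm, and SageMaker already provides a built-in XGBoost algorithm that supports AMT out of the box.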
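To make option C concrete, here is a minimal sketch of the tuning configuration such a job might use. It is written as a plain Python dict in the shape that the SageMaker `CreateHyperParameterTuningJob` API expects for its `HyperParameterTuningJobConfig` field. The helper name `build_tuning_config`, the job limits, and every numeric range are illustrative assumptions, not values taken from the question. The sketch makes no AWS call, so it runs without credentials.

```python
# Sketch only: builds the tuning configuration as a plain dict and makes no AWS call.
# The helper name, job limits, and all ranges below are illustrative assumptions.

def build_tuning_config(max_jobs=20, max_parallel=2):
    """Return a HyperParameterTuningJobConfig-style dict for built-in XGBoost."""
    return {
        "Strategy": "Bayesian",
        # Maximize AUC on the validation channel, as the question requires.
        "HyperParameterTuningJobObjective": {
            "Type": "Maximize",
            "MetricName": "validation:auc",
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            "MaxParallelTrainingJobs": max_parallel,
        },
        # The API takes range bounds as strings.
        "ParameterRanges": {
            "ContinuousParameterRanges": [
                {"Name": "eta", "MinValue": "0.1", "MaxValue": "0.5",
                 "ScalingType": "Auto"},
                {"Name": "alpha", "MinValue": "0", "MaxValue": "100",
                 "ScalingType": "Auto"},
                {"Name": "min_child_weight", "MinValue": "1", "MaxValue": "20",
                 "ScalingType": "Auto"},
            ],
            "IntegerParameterRanges": [
                {"Name": "max_depth", "MinValue": "1", "MaxValue": "10",
                 "ScalingType": "Auto"},
            ],
        },
    }


config = build_tuning_config()

# Collect the names of all hyperparameters that the job would search over.
tuned = sorted(
    r["Name"]
    for ranges in config["ParameterRanges"].values()
    for r in ranges
)
print(tuned)
```

In practice this dict would be passed, together with a training job definition that points at the built-in XGBoost image and the training and validation data, to `create_hyper_parameter_tuning_job` on a boto3 SageMaker client. That definition is omitted here because it depends on account-specific values such as the IAM role and S3 locations.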