AWS Certified Machine Learning – Specialty — Question 144
A machine learning (ML) specialist wants to create a data preparation job that uses a PySpark script with complex window aggregation operations to create data for training and testing. The ML specialist needs to evaluate the impact of the number of features and the sample count on model performance.
Which approach should the ML specialist use to determine the ideal data transformations for the model?
Answer options
- A. Add an Amazon SageMaker Debugger hook to the script to capture key metrics. Run the script as an AWS Glue job.
- B. Add an Amazon SageMaker Experiments tracker to the script to capture key metrics. Run the script as an AWS Glue job.
- C. Add an Amazon SageMaker Debugger hook to the script to capture key parameters. Run the script as a SageMaker processing job.
- D. Add an Amazon SageMaker Experiments tracker to the script to capture key parameters. Run the script as a SageMaker processing job.
Correct answer: D
Explanation
The correct answer is D. The number of features and the sample count are inputs to each data preparation run, so they should be logged as parameters. A SageMaker Experiments tracker records them per run, which lets the ML specialist compare different transformations against the resulting model performance. Running the PySpark script as a SageMaker processing job keeps the run inside SageMaker, where it can be associated with the experiment. Options A and B capture metrics rather than these parameters and run the script as an AWS Glue job. Option C uses a Debugger hook, which is designed for inspecting training jobs rather than for tracking and comparing data preparation runs.
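To make the idea of logging parameters per run concrete, here is a minimal sketch in plain Python. It does not call the SageMaker SDK: `LocalRunTracker`, `prepare_data`, and the sample rows are hypothetical stand-ins invented for illustration, and the `log_parameters` method name is only an assumption loosely modeled on what an experiment tracker offers. In a real processing job, the PySpark script would use the tracker provided by the SageMaker Experiments library instead.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

ParamValue = Union[int, float, str]


@dataclass
class LocalRunTracker:
    """Hypothetical in-memory stand-in for an experiment tracker (not the SageMaker API)."""
    run_name: str
    parameters: Dict[str, ParamValue] = field(default_factory=dict)

    def log_parameters(self, params: Dict[str, ParamValue]) -> None:
        # Store the parameters that describe this data preparation run.
        self.parameters.update(params)


def prepare_data(
    rows: List[Dict[str, int]],
    feature_names: List[str],
    max_samples: int,
    tracker: LocalRunTracker,
) -> List[Dict[str, int]]:
    """Keep the chosen features and sample count, then log both as parameters."""
    selected = rows[:max_samples]
    prepared = [{name: row[name] for name in feature_names} for row in selected]
    tracker.log_parameters(
        {"feature_count": len(feature_names), "sample_count": len(prepared)}
    )
    return prepared


# Hypothetical raw data: 10 rows with three candidate features.
raw_rows = [{"f1": i, "f2": i * 2, "f3": i * 3} for i in range(10)]

# Two data preparation variants, each tracked as its own run.
run_a = LocalRunTracker("two-features-all-rows")
data_a = prepare_data(raw_rows, ["f1", "f2"], 10, run_a)

run_b = LocalRunTracker("three-features-half-rows")
data_b = prepare_data(raw_rows, ["f1", "f2", "f3"], 5, run_b)

# Comparing the logged parameters side by side is the step that an
# experiment tracking service performs across many runs.
for run in (run_a, run_b):
    print(run.run_name, run.parameters)
```

Each run records its own feature count and sample count, so a later step that trains a model on `data_a` or `data_b` can be traced back to the exact preparation settings that produced its input.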