A machine learning (ML) specialist wants to create a data preparation job that uses a PyS…

Question

A machine learning (ML) specialist wants to create a data preparation job that uses a PySpark script with complex window aggregation operations to create data for training and testing. The ML specialist needs to evaluate the impact of the number of features and the sample count on model performance.
Which approach should the ML specialist use to determine the ideal data transformations for the model?

Accepted Answer

Correct answer: D. Add an Amazon SageMaker Experiments tracker to the script to capture key parameters. Run the script as a SageMaker processing job.

D is correct because a SageMaker Experiments tracker records the key parameters of each run, here the number of features and the sample count, so the ML specialist can compare runs and relate each data transformation to the resulting model performance. Options A and B capture metrics instead of parameters, but the parameters are what reveal the relationship between the features and performance. Option C uses a Debugger hook, which is less suited to tracking experiments than the Experiments tracker.
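As a rough illustration of the approach in option D, the sketch below shows how a data preparation script might record its key parameters with a SageMaker Experiments tracker. It assumes the `sagemaker-experiments` package (module `smexperiments`) is installed in the processing container and that `Tracker.load()` can resolve the trial component for the current job. The function names, parameter names, and feature names are hypothetical and do not come from the question.

```python
from typing import Any, Dict, Iterable, Optional


def build_run_parameters(feature_columns: Iterable[str],
                         sample_count: int,
                         window_size_days: int) -> Dict[str, Any]:
    """Collect the parameters that define one data preparation run.

    All keys are hypothetical names chosen for this example.
    """
    features = sorted(feature_columns)
    return {
        "feature_count": len(features),
        "feature_names": ",".join(features),
        "sample_count": sample_count,
        "window_size_days": window_size_days,
    }


def log_run_parameters(params: Dict[str, Any],
                       tracker: Optional[Any] = None) -> None:
    """Log parameters to a SageMaker Experiments tracker.

    A tracker can be injected (useful for local testing). Otherwise the
    tracker tied to the current SageMaker job is loaded.
    """
    if tracker is not None:
        tracker.log_parameters(params)
        return

    # Imported lazily so the module can be imported outside SageMaker.
    # Assumption: the sagemaker-experiments package is available.
    from smexperiments.tracker import Tracker

    with Tracker.load() as job_tracker:
        job_tracker.log_parameters(params)
```

Inside the PySpark script, one would typically call `build_run_parameters(...)` with the output columns and row count of the transformed DataFrame, then pass the result to `log_run_parameters(...)`. Passing the tracker in as an argument is a design choice that keeps the parameter logic testable without AWS credentials.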

AWS Certified Machine Learning – Specialty — Question 144
