An ML engineer is using an Amazon SageMaker Studio notebook to train a neural network by…

Question

An ML engineer is using an Amazon SageMaker Studio notebook to train a neural network by creating an estimator. The estimator runs a Python training script that uses Distributed Data Parallel (DDP) on a single instance that has more than one GPU. The ML engineer discovers that the training script is underutilizing GPU resources. The ML engineer must identify the point in the training script where resource utilization can be optimized. Which solution will meet this requirement?

Accepted Answer

Correct answer: B. B. Add SageMaker Profiler annotations to the training script. Run the script and generate a report from the results. — The correct answer is B because adding SageMaker Profiler annotations allows the engineer to analyze the performance bottlenecks in the training script, leading to better resource utilization insights. Option A only provides historical metrics without actionable insights, while option C focuses on logging rather than profiling the code. Option D is related to monitoring model performance rather than optimizing training resources directly.

AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 206

Answer options

Correct answer: B

Explanation