AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 4
An ML engineer trained an ML model on Amazon SageMaker to detect automobile accidents from closed-circuit TV footage. The ML engineer used SageMaker Data Wrangler to create a training dataset of images of accidents and non-accidents.
The model performed well during training and validation. However, the model is underperforming in production because of variations in the quality of the images from various cameras.
Which solution will improve the model's accuracy in the LEAST amount of time?
Answer options
- A. Collect more images from all the cameras. Use Data Wrangler to prepare a new training dataset.
- B. Recreate the training dataset by using the Data Wrangler corrupt image transform. Specify the impulse noise option.
- C. Recreate the training dataset by using the Data Wrangler enhance image contrast transform. Specify the Gamma contrast option.
- D. Recreate the training dataset by using the Data Wrangler resize image transform. Crop all images to the same size.
Correct answer: B
Explanation
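Option B is correct because the corrupt image transform with the impulse noise option adds random pixel-level noise to the existing training images. This acts as data augmentation: the model sees degraded inputs during training, so it generalizes better to the varying image quality of the production cameras. Because it reuses the images already in the dataset, it is also the fastest fix. Option A could help, but collecting new images from every camera and preparing a new dataset takes the most time. Options C and D only adjust contrast or image size, which does not directly address the variation in image quality.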
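To make the idea behind option B concrete, here is a minimal NumPy sketch of impulse (salt-and-pepper) noise augmentation. It is an illustration only and not Data Wrangler's implementation; the function name `add_impulse_noise` and the `amount` parameter are hypothetical, and it assumes 8-bit images stored as `(H, W)` or `(H, W, C)` arrays.

```python
import numpy as np


def add_impulse_noise(image, amount=0.05, rng=None):
    """Return a copy of `image` with roughly `amount` of its pixels
    replaced by pure black (0) or pure white (255).

    Hypothetical helper for illustration; assumes a uint8 image.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.copy()
    # One uniform random draw per pixel location (shared across channels).
    draws = rng.random(image.shape[:2])
    # Lowest amount/2 of the draws become "pepper" (black) pixels.
    noisy[draws < amount / 2] = 0
    # Highest amount/2 of the draws become "salt" (white) pixels.
    noisy[draws > 1 - amount / 2] = 255
    return noisy


# Example: corrupt about 10% of the pixels of a flat gray test image.
clean = np.full((64, 64, 3), 128, dtype=np.uint8)
corrupted = add_impulse_noise(clean, amount=0.10, rng=np.random.default_rng(0))
```

Training on a mix of clean and corrupted copies exposes the model to the kind of pixel-level degradation that lower-quality cameras can produce, without collecting any new footage.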