AWS Certified Machine Learning – Specialty — Question 140

A manufacturing company uses machine learning (ML) models to detect quality issues. The models use images that are taken of the company's product at the end of each production step. The company has thousands of machines at the production site that generate one image per second on average.
The company ran a successful pilot with a single manufacturing machine. For the pilot, ML specialists used an industrial PC that ran AWS IoT Greengrass with a long-running AWS Lambda function that uploaded the images to Amazon S3. The uploaded images invoked a Lambda function that was written in Python to perform inference by using an Amazon SageMaker endpoint that ran a custom model. The inference results were forwarded back to a web service that was hosted at the production site to prevent faulty products from being shipped.
The company scaled the solution out to all manufacturing machines by installing similarly configured industrial PCs on each production machine. However, latency for predictions increased beyond acceptable limits. Analysis shows that the internet connection is at its capacity limit.
How can the company resolve this issue MOST cost-effectively?

Answer options

Correct answer: D

Explanation

Option D is correct because it moves inference on-site. The bottleneck is the saturated internet connection, so running the model at the edge on the existing industrial PCs removes the image uploads that cause the latency, and it reuses infrastructure the company has already installed. Options A and C rely on AWS Direct Connect, which adds cost and still requires every image to leave the site, so they are not the most cost-effective fix. Option B adds image compression and decompression steps, which increase complexity and only reduce the upload traffic rather than eliminate it.
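
The bandwidth argument can be made concrete with a rough back-of-the-envelope calculation. The sketch below is illustrative only: the question says "thousands of machines" and "one image per second on average" but gives no exact machine count, image size, or result size, so MACHINES, IMAGE_BYTES, and RESULT_BYTES are assumed values, and uplink_mbps is a hypothetical helper name.

```python
def uplink_mbps(machines: int, payloads_per_second: float, payload_bytes: int) -> float:
    """Return the sustained link demand in megabits per second."""
    return machines * payloads_per_second * payload_bytes * 8 / 1_000_000


# Assumed figures, not taken from the question.
MACHINES = 2_000          # "thousands of machines"
IMAGES_PER_SECOND = 1.0   # "one image per second on average"
IMAGE_BYTES = 500_000     # assumed: roughly 500 KB per image
RESULT_BYTES = 200        # assumed: a small pass/fail inference result

# Cloud inference: every image must cross the internet connection.
cloud_mbps = uplink_mbps(MACHINES, IMAGES_PER_SECOND, IMAGE_BYTES)

# Edge inference: only a small result leaves each industrial PC.
edge_mbps = uplink_mbps(MACHINES, IMAGES_PER_SECOND, RESULT_BYTES)

print(f"cloud inference: {cloud_mbps:.1f} Mbps")
print(f"edge inference:  {edge_mbps:.1f} Mbps")
```

Under these assumptions, uploading images needs about 8,000 Mbps of sustained capacity, while forwarding only results needs about 3.2 Mbps. Because the web service in the question is hosted at the production site, those results may not need to cross the internet connection at all.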