AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 134

An ML engineer needs to deploy a trained model that is based on a genetic algorithm. The algorithm solves a complex problem and can take several minutes to generate predictions.

When the model is deployed, the model needs to access large amounts of data to process requests. The requests can involve as much as 100 MB of data.

Which deployment solution will meet these requirements with the LEAST operational overhead?

Answer options

Correct answer: C

Explanation

The correct answer is C. An Amazon SageMaker Asynchronous Inference endpoint queues incoming requests and processes them in the background, reading each input payload from Amazon S3 and writing the result back to Amazon S3. It supports payloads of up to 1 GB and processing times of up to one hour, so it covers both the 100 MB requests and the multi-minute prediction times. Because SageMaker manages the endpoint and the request queue, operational overhead stays low. Options A and B are less suited to large payloads combined with long inference times, and option D adds the overhead of managing containers.
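
To make the request flow concrete, here is a minimal Python sketch of how a client might submit one request to an asynchronous endpoint. It assumes the 100 MB input has already been uploaded to Amazon S3 and that `runtime_client` is a boto3 `sagemaker-runtime` client. The helper name `submit_async_request`, the endpoint name, and the S3 URIs are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict

# Upper bound that the API accepts for InvocationTimeoutSeconds (one hour).
MAX_INVOCATION_TIMEOUT_SECONDS = 3600


def submit_async_request(
    runtime_client: Any,
    endpoint_name: str,
    input_s3_uri: str,
    timeout_seconds: int = MAX_INVOCATION_TIMEOUT_SECONDS,
) -> Dict[str, str]:
    """Queue one request on an asynchronous endpoint (hypothetical helper).

    The payload is passed by reference: the endpoint reads it from Amazon S3,
    so the request itself stays small regardless of the input size.
    """
    if not input_s3_uri.startswith("s3://"):
        raise ValueError("the input payload must already be uploaded to Amazon S3")
    if not 1 <= timeout_seconds <= MAX_INVOCATION_TIMEOUT_SECONDS:
        raise ValueError("timeout_seconds must be between 1 and 3600")

    # The call returns as soon as the request is queued, not when inference ends.
    response = runtime_client.invoke_endpoint_async(
        EndpointName=endpoint_name,
        InputLocation=input_s3_uri,
        ContentType="application/octet-stream",
        InvocationTimeoutSeconds=timeout_seconds,
    )
    return {
        "inference_id": response["InferenceId"],
        "output_location": response["OutputLocation"],
    }
```

With boto3 installed and credentials configured, the client would typically be created with `boto3.client("sagemaker-runtime")`. The returned `output_location` is the S3 URI where the prediction is expected to appear once the model finishes, so the caller can poll that location or rely on a notification instead of holding a connection open for several minutes.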