AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 66

A company needs to host a custom ML model to perform forecast analysis. The forecast analysis will occur with predictable and sustained load during the same 2-hour period every day.
Multiple invocations during the analysis period will require quick responses. The company needs AWS to manage the underlying infrastructure and any auto scaling activities.
Which solution will meet these requirements?

Answer options

Correct answer: C

Explanation

Option C is correct because Amazon SageMaker Serverless Inference with provisioned concurrency keeps a set amount of capacity initialized, so invocations during the daily 2-hour window avoid cold starts and return quickly, while AWS manages the underlying infrastructure and scaling. Option A is incorrect because batch processing does not suit low-latency, per-invocation responses. Option B does not meet the need for rapid scaling and fast response times. Option D adds unnecessary operational complexity with Kubernetes when a serverless solution satisfies the requirements.
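
As an illustration of what option C can look like in practice, the following is a minimal sketch using boto3, assuming a SageMaker model has already been created. The names forecast-model, forecast-config, and forecast-endpoint, along with the memory and concurrency values, are hypothetical placeholders and not part of the question. The helper only builds the request, so it can be inspected without AWS credentials.

```python
from typing import Any, Dict


def build_endpoint_config_request(
    config_name: str,
    model_name: str,
    memory_mb: int = 4096,
    max_concurrency: int = 20,
    provisioned_concurrency: int = 10,
) -> Dict[str, Any]:
    """Build a CreateEndpointConfig request for a serverless variant.

    All default values here are illustrative assumptions, not recommendations.
    """
    # Provisioned concurrency cannot exceed the variant's maximum concurrency.
    if not 1 <= provisioned_concurrency <= max_concurrency:
        raise ValueError("provisioned_concurrency must be between 1 and max_concurrency")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                # A serverless variant omits InstanceType and InitialInstanceCount.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                    "ProvisionedConcurrency": provisioned_concurrency,
                },
            }
        ],
    }


def create_serverless_endpoint(endpoint_name: str, request: Dict[str, Any]) -> None:
    """Create the endpoint config and endpoint (requires AWS credentials)."""
    import boto3  # imported lazily so the request builder works without boto3

    sagemaker = boto3.client("sagemaker")
    sagemaker.create_endpoint_config(**request)
    sagemaker.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=request["EndpointConfigName"],
    )


if __name__ == "__main__":
    # Hypothetical names; replace with real resources before running.
    req = build_endpoint_config_request("forecast-config", "forecast-model")
    create_serverless_endpoint("forecast-endpoint", req)
```

The ProvisionedConcurrency value is what keeps capacity ready for the analysis window. Without it, a serverless endpoint may incur cold-start latency on the first invocations after an idle period.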