AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 176
A company is using Amazon SageMaker AI to deploy a new recommendation model for its ecommerce website. The model must use data from all client website interactions as input.
Traffic is variable throughout the day. The company needs to create an inference endpoint for the model.
Which type of inference endpoint will meet these requirements MOST cost-effectively?
Answer options
- A. Batch transform inference endpoint
- B. Asynchronous inference endpoint
- C. Real-time inference endpoint
- D. Serverless inference endpoint
Correct answer: D
Explanation
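Serverless inference is the most cost-effective choice because SageMaker AI provisions and scales the compute automatically with request volume and scales down to zero when there is no traffic. The company pays only for the compute used to process requests, not for idle capacity. A real-time inference endpoint runs on provisioned instances that are billed for as long as the endpoint is up, so capacity sized for peak traffic sits idle during quiet periods. Batch transform is not a persistent endpoint: it runs offline jobs over an entire dataset and cannot respond to individual website interactions as they occur. Asynchronous inference queues incoming requests and is designed for large payloads and long processing times, which does not match a recommendation model that responds to live interactions.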
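What distinguishes option D in practice is the endpoint configuration: a serverless variant specifies a memory size and a concurrency cap instead of an instance type and count. The following is a minimal sketch, not a definitive implementation. The helper names, the default values, and the "AllTraffic" variant name are illustrative assumptions, and the model is assumed to be already registered in SageMaker AI. The memory and concurrency bounds reflect the API reference as I understand it and should be checked against current quotas.

```python
# Memory sizes accepted by ServerlessConfig (assumed from the API reference).
ALLOWED_MEMORY_MB = {1024, 2048, 3072, 4096, 5120, 6144}


def build_serverless_endpoint_config(config_name, model_name,
                                     memory_mb=2048, max_concurrency=20):
    """Build the request for create_endpoint_config with a serverless variant.

    Hypothetical helper: it only assembles and validates the request dict,
    so it can be inspected without AWS credentials.
    """
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {sorted(ALLOWED_MEMORY_MB)}")
    if not 1 <= max_concurrency <= 200:
        raise ValueError("max_concurrency must be between 1 and 200")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                # ServerlessConfig replaces InstanceType / InitialInstanceCount,
                # which a real-time (provisioned) variant would require.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }


def create_serverless_endpoint(endpoint_name, request):
    """Send the request to SageMaker AI (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without it

    sagemaker = boto3.client("sagemaker")
    sagemaker.create_endpoint_config(**request)
    sagemaker.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=request["EndpointConfigName"],
    )
```

Calling `build_serverless_endpoint_config("rec-config", "rec-model")` returns the request dict, which can then be passed to `create_serverless_endpoint("rec-endpoint", request)`; both names here are placeholders. Because no instance count appears in the configuration, there is nothing to size for peak traffic, which is where the cost advantage over a provisioned real-time endpoint comes from.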