Question

An ecommerce company wants to update a production real-time machine learning (ML) recommendation engine API that uses Amazon SageMaker. The company wants to release a new model but does not want to make changes to applications that rely on the API. The company also wants to evaluate the performance of the new model in production traffic before the company fully rolls out the new model to all users. Which solution will meet these requirements with the LEAST operational overhead?

Accepted Answer

Correct answer: B. Modify the existing endpoint to use SageMaker production variants to distribute traffic between the old model and the new model. Production variants let you host multiple models behind a single SageMaker endpoint and split traffic among them by assigning each variant a weight. Because the endpoint name and its invocation API stay the same, client applications need no changes, and the new model can be evaluated on a fraction of live traffic before its weight is raised to cover all users. Options A and D add operational overhead because they require additional load balancers (ALB/NLB) and multiple endpoints to manage. Option C is incorrect because SageMaker batch transform is designed for offline, non-real-time predictions on large datasets, not for splitting real-time API traffic.
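As a rough illustration of option B, the sketch below builds a two-variant endpoint configuration and applies it to an existing endpoint with boto3. The model names, variant names, instance type, and the 10% starting share are all hypothetical placeholders, not values from the question. The boto3 calls sit in a separate function so the weight logic can be checked without AWS credentials.

```python
from typing import Dict, List


def build_variants(old_model: str, new_model: str, new_share: float,
                   instance_type: str = "ml.m5.large") -> List[Dict]:
    """Build a ProductionVariants list that sends `new_share` of traffic to the new model.

    Variant names and the default instance type are illustrative assumptions.
    """
    if not 0.0 <= new_share <= 1.0:
        raise ValueError("new_share must be between 0 and 1")
    return [
        {
            "VariantName": "current",
            "ModelName": old_model,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
            "InitialVariantWeight": 1.0 - new_share,
        },
        {
            "VariantName": "candidate",
            "ModelName": new_model,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
            "InitialVariantWeight": new_share,
        },
    ]


def traffic_fractions(variants: List[Dict]) -> Dict[str, float]:
    """SageMaker routes traffic in proportion to each weight over the sum of weights."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


def apply_to_endpoint(endpoint_name: str, config_name: str, variants: List[Dict]) -> None:
    """Create a new endpoint config and point the existing endpoint at it.

    Requires boto3 and AWS credentials; the endpoint name is unchanged, so
    callers of the API are unaffected.
    """
    import boto3  # imported here so the helpers above work without boto3 installed

    sm = boto3.client("sagemaker")
    sm.create_endpoint_config(EndpointConfigName=config_name, ProductionVariants=variants)
    sm.update_endpoint(EndpointName=endpoint_name, EndpointConfigName=config_name)


if __name__ == "__main__":
    # Hypothetical model names; start the new model at 10% of traffic.
    variants = build_variants("recs-model-v1", "recs-model-v2", new_share=0.10)
    print(traffic_fractions(variants))
```

Once the candidate variant looks healthy, its share can be raised without creating another configuration by calling `update_endpoint_weights_and_capacities` on the same endpoint with new `DesiredWeight` values.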

AWS Certified Machine Learning – Specialty — Question 319

Answer options

Correct answer: B

Explanation