AWS Certified Machine Learning – Specialty — Question 302
An ecommerce company has used Amazon SageMaker to deploy a factorization machines (FM) model to suggest products for customers. The company’s data science team has developed two new models by using the TensorFlow and PyTorch deep learning frameworks. The company needs to use A/B testing to evaluate the new models against the deployed model.
The required A/B testing setup is as follows:
• Send 70% of traffic to the FM model, 15% of traffic to the TensorFlow model, and 15% of traffic to the PyTorch model.
• For customers who are from Europe, send all traffic to the TensorFlow model.
Which architecture can the company use to implement the required A/B testing setup?
Answer options
- A. Create two new SageMaker endpoints for the TensorFlow and PyTorch models in addition to the existing SageMaker endpoint. Create an Application Load Balancer. Create a target group for each endpoint. Configure listener rules and add weight to the target groups. To send traffic to the TensorFlow model for customers who are from Europe, create an additional listener rule to forward traffic to the TensorFlow target group.
- B. Create two production variants for the TensorFlow and PyTorch models. Create an auto scaling policy and configure the desired A/B weights to direct traffic to each production variant. Update the existing SageMaker endpoint with the auto scaling policy. To send traffic to the TensorFlow model for customers who are from Europe, set the TargetVariant header in the request to point to the variant name of the TensorFlow model.
- C. Create two new SageMaker endpoints for the TensorFlow and PyTorch models in addition to the existing SageMaker endpoint. Create a Network Load Balancer. Create a target group for each endpoint. Configure listener rules and add weight to the target groups. To send traffic to the TensorFlow model for customers who are from Europe, create an additional listener rule to forward traffic to the TensorFlow target group.
- D. Create two production variants for the TensorFlow and PyTorch models. Specify the weight for each production variant in the SageMaker endpoint configuration. Update the existing SageMaker endpoint with the new configuration. To send traffic to the TensorFlow model for customers who are from Europe, set the TargetVariant header in the request to point to the variant name of the TensorFlow model.
Correct answer: D
Explanation
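Amazon SageMaker can host multiple models as production variants behind a single endpoint and splits traffic among them according to the weights defined in the endpoint configuration. Each variant's share of traffic is its weight divided by the sum of all weights, so weights of 70, 15, and 15 give the required split. To override this weighted routing for a specific subset of traffic, such as European customers, the client application sets the TargetVariant parameter of the InvokeEndpoint API request (sent as the X-Amzn-SageMaker-Target-Variant HTTP header) to the name of the TensorFlow variant. Options A and C are not viable as described: SageMaker endpoints are invoked through the SageMaker Runtime API and are not a supported target type for load balancer target groups, so placing a load balancer in front of them would at best add unnecessary components. Option B is incorrect because auto scaling policies control how many instances back a variant, not how traffic is split among variants.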
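
As a rough illustration of option D, the sketch below builds the production variant list for the endpoint configuration and the arguments for an InvokeEndpoint call using boto3 parameter names. The endpoint, variant, and model names, the instance type, the "EU" region code, and the JSON content type are all hypothetical placeholders rather than details from the question, and the code assumes the three models have already been created in SageMaker.

```python
from typing import Any, Dict, List

# Hypothetical names for illustration only; none come from the question.
ENDPOINT_NAME = "product-recs"
VARIANT_WEIGHTS = {
    "fm-variant": 70.0,
    "tensorflow-variant": 15.0,
    "pytorch-variant": 15.0,
}
EUROPE_VARIANT = "tensorflow-variant"


def build_production_variants(
    weights: Dict[str, float], instance_type: str = "ml.m5.large"
) -> List[Dict[str, Any]]:
    """Build the ProductionVariants list for create_endpoint_config."""
    return [
        {
            "VariantName": name,
            "ModelName": f"{name}-model",  # assumed to exist already
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
            "InitialVariantWeight": weight,
        }
        for name, weight in weights.items()
    ]


def traffic_shares(variants: List[Dict[str, Any]]) -> Dict[str, float]:
    """Share of traffic per variant: its weight divided by the total weight."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


def build_invoke_kwargs(payload: bytes, customer_region: str) -> Dict[str, Any]:
    """Arguments for invoke_endpoint; European traffic is pinned to one variant."""
    kwargs: Dict[str, Any] = {
        "EndpointName": ENDPOINT_NAME,
        "ContentType": "application/json",  # assumption about the input format
        "Body": payload,
    }
    if customer_region == "EU":  # hypothetical region code set by the client app
        kwargs["TargetVariant"] = EUROPE_VARIANT
    return kwargs


def apply_config(config_name: str) -> None:
    """Create the new endpoint configuration and point the endpoint at it."""
    import boto3  # imported lazily so the helpers above work without AWS access

    sm = boto3.client("sagemaker")
    sm.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=build_production_variants(VARIANT_WEIGHTS),
    )
    sm.update_endpoint(EndpointName=ENDPOINT_NAME, EndpointConfigName=config_name)


def invoke(payload: bytes, customer_region: str) -> bytes:
    """Send one request; omitting TargetVariant falls back to weighted routing."""
    import boto3

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(**build_invoke_kwargs(payload, customer_region))
    return response["Body"].read()
```

In this sketch, requests from non-European customers carry no TargetVariant and are distributed by weight, while requests flagged as European always reach the TensorFlow variant. Deciding which customers count as European is left to the client application.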