AWS Certified Machine Learning – Specialty — Question 327
A company maintains a 2 TB dataset that contains information about customer behaviors. The company stores the dataset in Amazon S3. The company stores a trained model container in Amazon Elastic Container Registry (Amazon ECR).
A machine learning (ML) specialist needs to run batch scoring with the model on the dataset to predict customer behavior. The ML specialist must select a scalable approach to score the model.
Which solution will meet these requirements MOST cost-effectively?
Answer options
- A. Score the model by using AWS Batch managed Amazon EC2 Reserved Instances. Create an Amazon EC2 instance store volume and mount it to the Reserved Instances.
- B. Score the model by using AWS Batch managed Amazon EC2 Spot Instances. Create an Amazon FSx for Lustre volume and mount it to the Spot Instances.
- C. Score the model by using an Amazon SageMaker notebook on Amazon EC2 Reserved Instances. Create an Amazon EBS volume and mount it to the Reserved Instances.
- D. Score the model by using an Amazon SageMaker notebook on Amazon EC2 Spot Instances. Create an Amazon Elastic File System (Amazon EFS) file system and mount it to the Spot Instances.
Correct answer: B
Explanation
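Batch scoring is a fault-tolerant workload: an interrupted job can simply be retried. That makes AWS Batch on Amazon EC2 Spot Instances a more cost-effective compute choice than Reserved Instances, which require a one-year or three-year commitment. AWS Batch also runs the model container directly from Amazon ECR and scales the compute environment to match the jobs in the queue. Amazon FSx for Lustre suits this scenario because a file system can be linked to an S3 bucket, giving the instances fast, high-throughput access to the 2 TB dataset. Options C and D are wrong because Amazon SageMaker notebooks are intended for interactive development; they are a poor fit for large-scale batch scoring and scale less well than a dedicated batch processing service such as AWS Batch.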
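To make the fault-tolerance argument concrete, the following is a minimal sketch of how the scoring work might be split into an AWS Batch array job with retries. It is an illustration, not part of the question. It only builds the keyword arguments that would be passed to boto3's Batch client `submit_job` call and makes no AWS calls itself. The helper names, job name, queue name, job definition name, and file names are all hypothetical. The sketch also assumes that the dataset is stored as many separate files that can be scored independently.

```python
import os


def build_array_job_request(job_name, job_queue, job_definition, num_shards, attempts=3):
    """Build keyword arguments for boto3's Batch client submit_job call.

    All names passed in are hypothetical placeholders. The retry strategy
    lets AWS Batch rerun a child job if its Spot Instance is reclaimed.
    """
    if not 2 <= num_shards <= 10000:
        raise ValueError("AWS Batch array size must be between 2 and 10,000")
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "arrayProperties": {"size": num_shards},
        "retryStrategy": {"attempts": attempts},
    }


def keys_for_shard(keys, shard_index, num_shards):
    """Return the input files one child job should score (round-robin split)."""
    return [key for i, key in enumerate(sorted(keys)) if i % num_shards == shard_index]


def current_shard_index(environ=os.environ):
    """Read the child index that AWS Batch sets for array jobs (0 if unset)."""
    return int(environ.get("AWS_BATCH_JOB_ARRAY_INDEX", "0"))


if __name__ == "__main__":
    # Hypothetical names; a real setup would use its own queue and job definition.
    request = build_array_job_request("score-customers", "spot-queue", "scoring-job-def", 4)
    print(request)
    files = ["part-%d.csv" % i for i in range(10)]
    print(keys_for_shard(files, current_shard_index(), 4))
```

Inside the container, each child job would call `current_shard_index()` and score only the files returned by `keys_for_shard`, reading them through the mounted FSx for Lustre path. Because every shard is independent, a Spot interruption costs at most one shard's worth of work before AWS Batch retries it.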