AWS Certified Machine Learning – Specialty — Question 327

A company maintains a 2 TB dataset that contains information about customer behaviors. The company stores the dataset in Amazon S3. The company stores a trained model container in Amazon Elastic Container Registry (Amazon ECR).

A machine learning (ML) specialist needs to score a batch model for the dataset to predict customer behavior. The ML specialist must select a scalable approach to score the model.

Which solution will meet these requirements MOST cost-effectively?

Answer options

Correct answer: B

Explanation

Using AWS Batch with Amazon EC2 Spot Instances provides the most cost-effective compute option for fault-tolerant batch processing workloads compared to Reserved Instances. Amazon FSx for Lustre is ideal for this scenario because it integrates natively with Amazon S3, allowing fast, high-throughput access to the 2 TB dataset. Using Amazon SageMaker notebooks for large-scale batch scoring is inappropriate and less scalable than utilizing a dedicated batch processing service like AWS Batch.