AWS Certified Big Data – Specialty — Question 17

A new algorithm has been written in Python to identify SPAM e-mails. The algorithm analyzes the free text contained within a sample set of 1 million e-mails stored on Amazon S3. The algorithm must be scaled across a production dataset of 5 PB, which also resides in Amazon S3 storage.
Which AWS service strategy is best for this use case?

Answer options

Correct answer: C

Explanation

The correct answer is C: Amazon Elasticsearch Service is built for indexing, searching, and analyzing large volumes of free text, which matches the workload described. Options A and D do not scale effectively to the 5 PB production dataset. Option B can process data at that scale, but it may not provide the specialized text analysis features that Elasticsearch offers.