AWS Certified Data Analytics – Specialty — Question 86

A company wants to collect and process events data from different departments in near-real time. Before storing the data in Amazon S3, the company needs to clean the data by standardizing the format of the address and timestamp columns. The data varies in size based on the overall load at each particular point in time. A single data record can be 100 KB-10 MB.
How should a data analytics specialist design the solution for data ingestion?

Answer options

Correct answer: C

Explanation

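The correct answer is C. The deciding constraint is record size: a single record can be as large as 10 MB. With Amazon Managed Streaming for Apache Kafka (Amazon MSK), the maximum message size is configurable (for example, the broker setting message.max.bytes and the topic setting max.message.bytes), so the cluster can be set up to accept records across the full 100 KB to 10 MB range. A consumer application can then standardize the address and timestamp columns in near-real time before writing the results to Amazon S3. Options A and B are not ruled out by latency, because the workload itself is near-real time rather than batch; they are less suitable for records at the upper end of this size range. Option D does not provide the necessary data cleansing capabilities at scale.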
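
The cleansing step described above can be sketched roughly as follows. This is a minimal illustration, not the exam's reference implementation: it assumes each record is a JSON object with hypothetical "address" and "timestamp" fields, assumes a small hypothetical list of input timestamp layouts, and treats timestamps without an offset as UTC. The Kafka consumer loop and the write to Amazon S3 are deliberately left out.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical input layouts; a real feed would need its own list.
TIMESTAMP_FORMATS = (
    "%Y-%m-%dT%H:%M:%S%z",
    "%Y-%m-%d %H:%M:%S",
    "%m/%d/%Y %H:%M",
)


def standardize_timestamp(raw: str) -> str:
    """Return the timestamp as ISO 8601 in UTC.

    Assumption: values without a UTC offset are already in UTC.
    """
    for fmt in TIMESTAMP_FORMATS:
        try:
            parsed = datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue  # try the next known layout
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc).isoformat()
    raise ValueError(f"unrecognized timestamp: {raw!r}")


def standardize_address(raw: str) -> str:
    """Collapse runs of whitespace and upper-case the address."""
    return re.sub(r"\s+", " ", raw).strip().upper()


def clean_record(payload: bytes) -> bytes:
    """Clean one raw record (JSON bytes) and return it as JSON bytes."""
    record = json.loads(payload)
    record["address"] = standardize_address(record["address"])
    record["timestamp"] = standardize_timestamp(record["timestamp"])
    return json.dumps(record, sort_keys=True).encode("utf-8")
```

In a consumer application, a function like clean_record would be applied to each message value read from the raw topic, and the cleaned bytes would then be batched and written to Amazon S3.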