AWS Certified Big Data – Specialty — Question 16

A media advertising company handles a large number of real-time messages sourced from over 200 websites.
The company's data engineer needs to collect and process records in real time for analysis using Spark
Streaming on Amazon Elastic MapReduce (EMR). As a top priority, the data engineer must fulfill a corporate
mandate to keep ALL raw messages as they are received.
Which Amazon Kinesis configuration meets these requirements?

Answer options

Correct answer: C

Explanation

Option C is correct because every raw message is first captured in Amazon Kinesis Firehose, which is backed by Amazon S3, so the messages are stored as they are received. AWS Lambda then transfers the messages to Amazon Kinesis Streams for processing with Spark Streaming. Options A and B do not guarantee that all raw messages are stored as they arrive. Option D processes the messages correctly but does not use AWS Lambda to handle the data flow.
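
The data flow described for option C can be sketched in Python. This is a minimal illustration only, and it rests on assumptions the question does not state: that the Lambda function is attached to Firehose as a data-transformation function (so it receives an event with a "records" list whose "data" fields are base64-encoded), that the stream is named "raw-messages-stream" (a hypothetical name), and that the Firehose recordId is an acceptable partition key. The helper forward_to_stream is likewise a name made up for this example. Each record is returned to Firehose unmodified, so the copy delivered to S3 stays raw.

```python
import base64

STREAM_NAME = "raw-messages-stream"  # hypothetical stream name (assumption)
MAX_BATCH = 500  # PutRecords accepts at most 500 records per request


def forward_to_stream(event, kinesis_client, stream_name=STREAM_NAME):
    """Copy each Firehose record to a Kinesis stream and pass it through unchanged.

    Returns a tuple of (Firehose transformation response, failed forward count).
    """
    records = event["records"]

    # Firehose hands the payload over base64-encoded; the stream gets raw bytes.
    entries = [
        {"Data": base64.b64decode(rec["data"]), "PartitionKey": rec["recordId"]}
        for rec in records
    ]

    failed = 0
    for start in range(0, len(entries), MAX_BATCH):
        response = kinesis_client.put_records(
            StreamName=stream_name,
            Records=entries[start:start + MAX_BATCH],
        )
        failed += response.get("FailedRecordCount", 0)

    # Echo every record back with its original data so Firehose still
    # delivers the untouched message to S3.
    passthrough = {
        "records": [
            {"recordId": rec["recordId"], "result": "Ok", "data": rec["data"]}
            for rec in records
        ]
    }
    return passthrough, failed


def lambda_handler(event, context):
    # boto3 is imported here so the helper above can be exercised without AWS.
    import boto3

    passthrough, failed = forward_to_stream(event, boto3.client("kinesis"))
    if failed:
        print(f"{failed} record(s) were not forwarded to {STREAM_NAME}")
    return passthrough
```

Because the function returns every record with result "Ok" and the original data, a failed forward to the stream does not stop the raw message from reaching S3, which matches the mandate's priority. Retrying failed forwards is left out of this sketch.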