A media advertising company handles a large number of real-time messages sourced from ove…

Question

A media advertising company handles a large number of real-time messages sourced from over 200 websites.
The company's data engineer needs to collect and process records in real time for analysis using Spark
Streaming on Amazon Elastic MapReduce (EMR). The data engineer needs to fulfill a corporate mandate to keep ALL raw messages as they are received as a top priority.
Which Amazon Kinesis configuration meets these requirements?

Accepted Answer

Correct answer: C. Publish messages to Amazon Kinesis Firehose backed by Amazon Simple Storage Service (S3). Use AWS Lambda to pull messages from Firehose to Streams for processing with Spark Streaming.

Option C is correct because every raw message is captured by Kinesis Firehose, which persists it to S3, while AWS Lambda forwards the messages to Kinesis Streams for processing with Spark Streaming. Options A and B do not guarantee that all raw messages are stored as they arrive. Option D processes the messages correctly but does not use AWS Lambda to manage the data flow.
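The Lambda step in the accepted answer could be wired up in several ways, and the question does not say which one is intended. The sketch below assumes Firehose's data-transformation hook is used: Firehose invokes the function with a batch of base64-encoded records, the function copies each payload to a Kinesis stream, and it returns every record unchanged so the raw message still lands in S3. The stream name `raw-messages-stream` and the optional `kinesis_client` parameter are hypothetical, added here only for illustration and testing.

```python
import base64

STREAM_NAME = "raw-messages-stream"  # hypothetical stream name, not from the question
MAX_BATCH = 500  # PutRecords accepts at most 500 records per call


def build_stream_entries(firehose_records):
    """Turn Firehose transformation-event records into PutRecords entries."""
    entries = []
    for record in firehose_records:
        payload = base64.b64decode(record["data"])
        # Using recordId as the partition key is an assumption made for this sketch.
        entries.append({"Data": payload, "PartitionKey": record["recordId"]})
    return entries


def handler(event, context, kinesis_client=None):
    """Forward a copy of each record to Streams and hand it back to Firehose untouched."""
    if kinesis_client is None:
        import boto3  # imported lazily so the sketch can be exercised without AWS
        kinesis_client = boto3.client("kinesis")

    records = event["records"]
    entries = build_stream_entries(records)
    for start in range(0, len(entries), MAX_BATCH):
        # A production handler would also inspect FailedRecordCount and retry.
        kinesis_client.put_records(
            StreamName=STREAM_NAME,
            Records=entries[start:start + MAX_BATCH],
        )

    # Returning result "Ok" with the original data keeps the raw message for S3.
    return {
        "records": [
            {"recordId": r["recordId"], "result": "Ok", "data": r["data"]}
            for r in records
        ]
    }
```

Because the function returns each record with its original `data`, the copy that Firehose writes to S3 is byte-for-byte what the producer sent, which is the property the mandate in the question cares about. Spark Streaming on EMR would then read from the Kinesis stream rather than from Firehose.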

AWS Certified Big Data – Specialty — Question 16

Answer options

Explanation