A banking company uses an application to collect large volumes of transactional data. The…

Question

A banking company uses an application to collect large volumes of transactional data. The company uses Amazon Kinesis Data Streams for real-time analytics. The company’s application uses the PutRecord action to send data to Kinesis Data Streams. A data engineer has observed network outages during certain times of day. The data engineer wants to configure exactly-once delivery for the entire processing pipeline. Which solution will meet this requirement?

Accepted Answer

Correct answer: A. Design the application so it can remove duplicates during processing by embedding a unique ID in each record at the source. Kinesis Data Streams provides at-least-once delivery, so when a network outage causes the application to retry a PutRecord call, the same record can be written to the stream more than once. Embedding a unique ID in each record at the source lets downstream processing identify and discard those duplicates, which gives effectively exactly-once results across the whole pipeline. Option B only addresses duplicate processing within Apache Flink, not the entire pipeline. Option C modifies the data source, which may not be feasible or effective in every scenario. Option D switches technologies, which does not by itself guarantee exactly-once delivery and adds unnecessary complexity.
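The deduplication idea in option A can be illustrated with a minimal Python sketch. It does not call Kinesis at all. The stream is simulated with a plain list, and the names `make_record`, `DedupingConsumer`, and `record_id` are hypothetical, chosen only for this example. The in-memory set of seen IDs is also an assumption made for brevity. A real pipeline would likely need a durable store so that deduplication survives consumer restarts.

```python
import json
import uuid


def make_record(payload: dict) -> bytes:
    """Attach a unique ID at the source, before the first send attempt.

    Because the ID is assigned once, every retry of the same record
    carries the same ID.
    """
    record = {"record_id": str(uuid.uuid4()), **payload}
    return json.dumps(record).encode("utf-8")


class DedupingConsumer:
    """Processes each record_id at most once.

    The in-memory set is for illustration only (an assumption); a real
    consumer would keep seen IDs in durable storage.
    """

    def __init__(self):
        self.seen_ids = set()
        self.processed = []

    def handle(self, data: bytes) -> bool:
        record = json.loads(data)
        record_id = record["record_id"]
        if record_id in self.seen_ids:
            return False  # duplicate, e.g. from a producer retry; skip it
        self.seen_ids.add(record_id)
        self.processed.append(record)
        return True


# Simulate a retry after a network outage: the same serialized record
# reaches the (simulated) stream twice, followed by a distinct record.
record = make_record({"account": "12345", "amount": 250})
stream = [record, record, make_record({"account": "67890", "amount": 75})]

consumer = DedupingConsumer()
results = [consumer.handle(data) for data in stream]
```

The key design choice is that the ID is generated before the first send, not per attempt, so a retried record is recognizable as the same logical record wherever it is consumed.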

AWS Certified Data Engineer – Associate (DEA-C01) — Question 252

Explanation