AWS Certified Data Engineer – Associate (DEA-C01) — Question 66
A data engineer must manage the ingestion of real-time streaming data into AWS. The data engineer wants to perform real-time analytics on the incoming streaming data by using time-based aggregations over a window of up to 30 minutes. The data engineer needs a solution that is highly fault tolerant.
Which solution will meet these requirements with the LEAST operational overhead?
Answer options
- A. Use an AWS Lambda function that includes both the business and the analytics logic to perform time-based aggregations over a window of up to 30 minutes for the data in Amazon Kinesis Data Streams.
- B. Use Amazon Managed Service for Apache Flink (previously known as Amazon Kinesis Data Analytics) to analyze the data that might occasionally contain duplicates by using multiple types of aggregations.
- C. Use an AWS Lambda function that includes both the business and the analytics logic to perform aggregations for a tumbling window of up to 30 minutes, based on the event timestamp.
- D. Use Amazon Managed Service for Apache Flink (previously known as Amazon Kinesis Data Analytics) to analyze the data by using multiple types of aggregations to perform time-based analytics over a window of up to 30 minutes.
Correct answer: D
Explanation
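The correct answer is D. Amazon Managed Service for Apache Flink is a managed service for stateful stream processing: it supports time-based windowed aggregations natively, and it uses checkpoints to restore application state after a failure, so the data engineer does not have to build or operate the fault-tolerance logic. Options A and C rely on AWS Lambda, which would require custom code to hold and recover aggregation state across invocations. Lambda's built-in tumbling windows for Kinesis event sources are also limited to 15 minutes, which is shorter than the required 30 minutes. Option B also uses Managed Service for Apache Flink, but it focuses on data that might contain duplicates and does not address the time-based window of up to 30 minutes, so D is the best fit.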
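To make the phrase "time-based aggregations over a window" concrete, here is a minimal sketch in plain Python of a 30-minute tumbling window keyed on the event timestamp. It is not the Apache Flink API and not part of the exam material. The names `WINDOW_SECONDS`, `window_start`, and `aggregate` are hypothetical, and the in-memory dictionary stands in for the checkpointed state that a managed stream processor would keep for you.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Assumed window size for illustration: 30 minutes, as in the question.
WINDOW_SECONDS = 30 * 60


def window_start(event_time: datetime) -> datetime:
    """Return the start of the tumbling window that contains event_time."""
    epoch = int(event_time.timestamp())
    aligned = epoch - (epoch % WINDOW_SECONDS)
    return datetime.fromtimestamp(aligned, tz=timezone.utc)


def aggregate(events):
    """Group (event_time, value) pairs into windows and compute count and sum.

    State lives only in this process. A real streaming job needs durable,
    recoverable state, which is the part a managed service takes over.
    """
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for event_time, value in events:
        bucket = totals[window_start(event_time)]
        bucket["count"] += 1
        bucket["sum"] += value
    return dict(totals)


if __name__ == "__main__":
    sample = [
        (datetime(2024, 1, 1, 12, 5, tzinfo=timezone.utc), 10.0),
        (datetime(2024, 1, 1, 12, 29, tzinfo=timezone.utc), 5.0),
        (datetime(2024, 1, 1, 12, 31, tzinfo=timezone.utc), 2.5),
    ]
    for start, stats in sorted(aggregate(sample).items()):
        print(start.isoformat(), stats)
```

The first two sample events fall into the window that starts at 12:00 UTC and the third into the window that starts at 12:30 UTC. Writing the grouping is the easy part. Keeping that state safe across failures for up to 30 minutes is the operational burden that options A and C would leave with the data engineer.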