AWS Certified Data Engineer – Associate (DEA-C01) — Question 249
A manufacturing company wants to collect data from sensors. A data engineer needs to implement a solution that ingests sensor data in near real time.
The solution must store the data to a persistent data store. The solution must store the data in nested JSON format. The company must have the ability to query from the data store with a latency of less than 10 milliseconds.
Which solution will meet these requirements with the LEAST operational overhead?
Answer options
- A. Use a self-hosted Apache Kafka cluster to capture the sensor data. Store the data in Amazon S3 for querying.
- B. Use AWS Lambda to process the sensor data. Store the data in Amazon S3 for querying.
- C. Use Amazon Kinesis Data Streams to capture the sensor data. Store the data in Amazon DynamoDB for querying.
- D. Use Amazon Simple Queue Service (Amazon SQS) to buffer incoming sensor data. Use AWS Glue to store the data in Amazon RDS for querying.
Correct answer: C
Explanation
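Option C is correct. Amazon Kinesis Data Streams is a managed service built for near-real-time ingestion. Amazon DynamoDB is a serverless, persistent data store that holds nested JSON natively (as map and list attributes) and serves key-based reads with single-digit millisecond latency, which meets the under-10-millisecond requirement. The other options either miss the latency requirement or add operational overhead:
- A. A self-hosted Apache Kafka cluster must be provisioned, patched, and scaled by the company, and reads from Amazon S3 typically do not return in under 10 milliseconds.
- B. AWS Lambda is a compute service, not an ingestion service on its own, and Amazon S3 has the same query latency limitation as in option A.
- D. Amazon SQS is a message queue rather than a streaming ingestion service, AWS Glue is primarily oriented toward batch ETL jobs, and Amazon RDS adds instance and schema management while not being designed around nested JSON documents.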
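
To make the data path in option C concrete, here is a minimal sketch of one way to move records from the stream into the table. The question does not say how the two services are connected, so the sketch assumes a Lambda function triggered by the Kinesis stream. The table name `SensorReadings`, the key attributes `sensor_id` and `ts`, and the payload shape are hypothetical and chosen only for illustration.

```python
import base64
import json
from decimal import Decimal


def decode_kinesis_records(event):
    """Turn a Kinesis-triggered Lambda event into DynamoDB-ready dicts."""
    items = []
    for record in event.get("Records", []):
        # Kinesis delivers each payload to Lambda as base64-encoded bytes.
        raw = base64.b64decode(record["kinesis"]["data"])
        # boto3's DynamoDB resource rejects Python floats, so parse them as Decimal.
        items.append(json.loads(raw, parse_float=Decimal))
    return items


def handler(event, context, table=None):
    """Write each decoded sensor reading to DynamoDB as one item."""
    if table is None:
        # Imported lazily so the decoding logic can be exercised without AWS access.
        import boto3

        # Assumption: a table named "SensorReadings" already exists.
        table = boto3.resource("dynamodb").Table("SensorReadings")
    items = decode_kinesis_records(event)
    for item in items:
        # Nested dicts and lists are stored as DynamoDB map and list attributes.
        table.put_item(Item=item)
    return {"written": len(items)}
```

Each item must contain the table's key attributes (here assumed to be `sensor_id` and `ts`) for the write to succeed. A low-latency read would then be a `get_item` call on that key, which is the access pattern DynamoDB's single-digit millisecond latency applies to.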