AWS Certified Solutions Architect – Professional (SAP-C02) — Question 270

A company is developing a gene reporting device that will collect genomic information to assist researchers with collecting large samples of data from a diverse population. The device will push 8 KB of genomic data every second to a data platform that will need to process and analyze the data and provide information back to researchers. The data platform must meet the following requirements:

• Provide near-real-time analytics of the inbound genomic data
• Ensure the data is flexible, parallel, and durable
• Deliver results of processing to a data warehouse

Which strategy should a solutions architect use to meet these requirements?

Answer options

Correct answer: B

Explanation

Amazon Kinesis Data Streams is designed to ingest rapid, continuous data streams. Shards provide parallelism, records are replicated across multiple Availability Zones for durability, and the shard count can be adjusted as volume changes, which covers the flexible, parallel, and durable requirement. Processing the stream with the Kinesis Client Library (KCL) supports near-real-time analytics, and Amazon EMR can then load the results into Amazon Redshift, which satisfies the data warehouse requirement. The other options are incorrect because Amazon RDS is a relational database rather than a data warehouse, and S3- or SQS-based ingestion patterns do not support parallel, near-real-time analytics as directly as Kinesis Data Streams.
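
To make the parallelism point concrete, the sketch below estimates how many shards a stream might need for a fleet of devices that each push 8 KB per second. It is a rough sizing aid rather than part of the question: the function name `shards_needed` and the device counts are hypothetical, 8 KB is taken as 8 × 1024 bytes, and the per-shard write limits (1 MB per second, read conservatively as 10^6 bytes, and 1,000 records per second) are the provisioned-mode quotas as commonly documented.

```python
# Rough shard-count estimate for a Kinesis data stream (provisioned mode).
# Assumption: per-shard write limits of 1 MB/s and 1,000 records/s,
# with 1 MB read conservatively as 1,000,000 bytes.
SHARD_MAX_BYTES_PER_SEC = 1_000_000
SHARD_MAX_RECORDS_PER_SEC = 1_000


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def shards_needed(devices: int,
                  record_bytes: int = 8 * 1024,
                  records_per_sec_per_device: int = 1) -> int:
    """Return the minimum shard count that satisfies both write limits.

    `devices` is a hypothetical fleet size; the defaults model the
    8 KB-per-second device described in the question.
    """
    total_records = devices * records_per_sec_per_device
    total_bytes = total_records * record_bytes
    by_bytes = ceil_div(total_bytes, SHARD_MAX_BYTES_PER_SEC)
    by_records = ceil_div(total_records, SHARD_MAX_RECORDS_PER_SEC)
    # A stream always has at least one shard.
    return max(1, by_bytes, by_records)


if __name__ == "__main__":
    for fleet in (1, 122, 123, 10_000):
        print(fleet, "devices ->", shards_needed(fleet), "shard(s)")
```

With 8 KB records the byte limit is reached long before the record-count limit, so under these assumptions one shard absorbs roughly 122 devices and capacity grows by adding shards.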