AWS Certified Solutions Architect – Associate (SAA-C02) — Question 374
A media company has an application that tracks user clicks on its websites and performs analytics to provide near-real-time recommendations. The application has a fleet of Amazon EC2 instances that receive data from the websites and send the data to an Amazon RDS DB instance. Another fleet of EC2 instances hosts the portion of the application that is continuously checking changes in the database and executing SQL queries to provide recommendations. Management has requested a redesign to decouple the infrastructure. The solution must ensure that data analysts are writing SQL to analyze the data only. No data can be lost during the deployment.
What should a solutions architect recommend?
Answer options
- A. Use Amazon Kinesis Data Streams to capture the data from the websites, Kinesis Data Firehose to persist the data on Amazon S3, and Amazon Athena to query the data.
- B. Use Amazon Kinesis Data Streams to capture the data from the websites, Kinesis Data Analytics to query the data, and Kinesis Data Firehose to persist the data on Amazon S3.
- C. Use Amazon Simple Queue Service (Amazon SQS) to capture the data from the websites, keep the fleet of EC2 instances, and change to a bigger instance type in the Auto Scaling group configuration.
- D. Use Amazon Simple Notification Service (Amazon SNS) to receive data from the websites and proxy the messages to AWS Lambda functions that execute the queries and persist the data. Change Amazon RDS to Amazon Aurora Serverless to persist the data.
Correct answer: B
Explanation
Option B is correct because Amazon Kinesis Data Streams reliably captures the clickstream data and decouples the websites from downstream processing, while Kinesis Data Analytics lets analysts run SQL queries on the streaming data in near real time to generate recommendations. Kinesis Data Firehose continuously persists the stream data to Amazon S3, so no data is lost. Option A is incorrect because Athena is designed for ad hoc analysis of data at rest in S3 rather than near-real-time streaming analytics. Option C keeps the EC2 fleet and only moves it to a bigger instance type, and Option D moves the queries into Lambda functions, so neither provides a decoupled, SQL-focused analytics solution for the analysts.
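
As a rough illustration of the pipeline in option B, the Python sketch below shows two pieces: building the arguments for a Kinesis Data Streams `put_record` call (the boto3 call itself is left commented out because it needs AWS credentials), and a plain-Python tumbling-window count that mimics the kind of windowed GROUP BY a Kinesis Data Analytics SQL query might compute. The stream name `clickstream`, the event fields (`user_id`, `page`, `ts`), and the 60-second window are assumptions made for illustration; they do not come from the question.

```python
import json
from collections import Counter


def build_put_record_args(stream_name, event):
    """Build keyword arguments for boto3's Kinesis put_record call."""
    return {
        "StreamName": stream_name,
        # Kinesis record payloads are bytes; here the event is JSON-encoded.
        "Data": json.dumps(event, sort_keys=True).encode("utf-8"),
        # Assumption: partition by user so one user's clicks share a shard.
        "PartitionKey": event["user_id"],
    }


def tumbling_window_counts(events, window_seconds):
    """Count clicks per (window_start, page), like a windowed SQL GROUP BY."""
    counts = Counter()
    for event in events:
        # Round the timestamp down to the start of its window.
        window_start = event["ts"] - event["ts"] % window_seconds
        counts[(window_start, event["page"])] += 1
    return dict(counts)


# Hypothetical click events; "ts" is seconds since an arbitrary start.
clicks = [
    {"user_id": "u1", "page": "/home", "ts": 0},
    {"user_id": "u2", "page": "/home", "ts": 30},
    {"user_id": "u1", "page": "/shop", "ts": 59},
    {"user_id": "u3", "page": "/home", "ts": 61},
]

args = build_put_record_args("clickstream", clicks[0])
# With AWS credentials configured, the record could be sent like this:
#   import boto3
#   boto3.client("kinesis").put_record(**args)

window_counts = tumbling_window_counts(clicks, 60)
print(window_counts)
```

In the architecture the question describes, the aggregation would be written by analysts as SQL in Kinesis Data Analytics rather than in Python; the function above only shows what such a windowed query computes.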