AWS Certified Solutions Architect – Associate (SAA-C02) — Question 579
An online retail company needs to run near-real-time analytics on website traffic to analyze top-selling products across different locations. The product purchase data and the user location details are sent to a third-party application that runs on premises. The application processes the data and moves the data into the company's analytics engine.
The company needs to implement a cloud-based solution to make the data available for near-real-time analytics.
Which solution will meet these requirements with the LEAST operational overhead?
Answer options
- A. Use Amazon Kinesis Data Streams to ingest the data. Use AWS Lambda to transform the data. Configure Lambda to write the data to Amazon OpenSearch Service (Amazon Elasticsearch Service).
- B. Configure Amazon Kinesis Data Streams to write the data to an Amazon S3 bucket. Schedule an AWS Glue crawler job to enrich the data and update the AWS Glue Data Catalog. Use Amazon Athena for analytics.
- C. Configure Amazon Kinesis Data Streams to write the data to an Amazon S3 bucket. Add an Apache Spark job on Amazon EMR to enrich the data in the S3 bucket and write the data to Amazon OpenSearch Service (Amazon Elasticsearch Service).
- D. Use Amazon Kinesis Data Firehose to ingest the data. Enable Kinesis Data Firehose data transformation with AWS Lambda. Configure Kinesis Data Firehose to write the data to Amazon OpenSearch Service (Amazon Elasticsearch Service).
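Correct answer: D
Explanation
Option D has the least operational overhead because Amazon Kinesis Data Firehose is fully managed. It scales automatically, requires no shard provisioning or custom consumer code, invokes an AWS Lambda function to transform records in flight, and delivers the results directly to Amazon OpenSearch Service. Firehose buffers records briefly before delivery (seconds to minutes, depending on the configured buffer interval), which still satisfies a near-real-time requirement. Option A can also deliver data in near real time, but the company would have to size and monitor shards and maintain Lambda code that both transforms the data and writes it to OpenSearch Service, including batching, retries, and error handling. Options B and C add more overhead and latency: Kinesis Data Streams cannot write to Amazon S3 without a separate consumer, a scheduled AWS Glue crawler only catalogs data rather than enriching it, and an Apache Spark job on Amazon EMR means managing a cluster and processing the data in batches.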
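The Lambda-based data transformation that option D describes can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes each incoming record is a JSON object, and the field names (product_id, location, quantity) are hypothetical. The record contract itself (a base64-encoded data field, a recordId, and a result of Ok, Dropped, or ProcessingFailed) is what Kinesis Data Firehose expects back from a transformation function.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each Firehose record and return it for delivery."""
    output = []
    for record in event["records"]:
        try:
            # Firehose passes each record's payload base64-encoded.
            payload = json.loads(base64.b64decode(record["data"]))
            # Hypothetical enrichment: keep and normalize the analytics fields.
            doc = {
                "product_id": payload["product_id"],
                "location": payload["location"].strip().upper(),
                "quantity": int(payload.get("quantity", 1)),
            }
            data = base64.b64encode(json.dumps(doc).encode("utf-8")).decode("utf-8")
            output.append({"recordId": record["recordId"], "result": "Ok", "data": data})
        except (KeyError, ValueError, TypeError, AttributeError):
            # Return the original payload so Firehose can route it as a failed record.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    # Every incoming recordId must appear exactly once in the response.
    return {"records": output}
```

The function only reshapes records. Buffering, retries, and delivery to the OpenSearch Service domain stay with Firehose, which is where the reduction in operational overhead comes from.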