AWS Certified Solutions Architect – Associate (SAA-C03) — Question 904
A weather forecasting company collects temperature readings from various sensors on a continuous basis. An existing data ingestion process collects the readings and aggregates them into larger Apache Parquet files. Then the process encrypts the files by using client-side encryption with AWS KMS managed keys (CSE-KMS). Finally, the process writes the files to an Amazon S3 bucket with separate prefixes for each calendar day.
The company wants to run occasional SQL queries on the data to take sample moving averages for a specific calendar day.
Which solution will meet these requirements MOST cost-effectively?
Answer options
- A. Configure Amazon Athena to read the encrypted files. Run SQL queries on the data directly in Amazon S3.
- B. Use Amazon S3 Select to run SQL queries on the data directly in Amazon S3.
- C. Configure Amazon Redshift to read the encrypted files. Use Redshift Spectrum and Redshift query editor v2 to run SQL queries on the data directly in Amazon S3.
- D. Configure Amazon EMR Serverless to read the encrypted files. Use Apache SparkSQL to run SQL queries on the data directly in Amazon S3.
Correct answer: A
Explanation
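Amazon Athena is a serverless interactive query service that can read Parquet files encrypted with CSE-KMS directly from Amazon S3. It bills by the amount of data each query scans, with no cluster to keep running, so it is the most cost-effective option for occasional queries. The per-day prefixes and the columnar Parquet format further limit how much data a single-day query has to scan. Option B fails because Amazon S3 cannot decrypt client-side encrypted objects, so S3 Select cannot query them; S3 Select also operates on one object at a time rather than on a whole prefix. Option C fails on both cost and capability: Redshift Spectrum requires a Redshift cluster or serverless workgroup and does not support Amazon S3 client-side encryption. Option D does not require provisioning a cluster, but EMR Serverless still means creating an application, submitting Spark jobs, and paying for the compute each job uses, which is more setup and typically more cost than an ad hoc Athena query.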
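As a rough illustration of Option A, the Python sketch below builds the three Athena SQL statements such a setup might use: a table definition, a partition registration for one day, and a moving-average query. The bucket name, prefix layout, table name, column names, and window size are all hypothetical and do not come from the question. The `has_encrypted_data` table property is the setting Athena uses to mark CSE-KMS encrypted data. The sketch only generates SQL text and does not call AWS.

```python
from datetime import date

# Hypothetical placeholders: none of these names appear in the question.
BUCKET = "example-weather-readings"
TABLE = "sensor_readings"


def create_table_ddl() -> str:
    """Build DDL for an external Parquet table over CSE-KMS encrypted files."""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {TABLE} (\n"
        "  sensor_id string,\n"
        "  reading_time timestamp,\n"
        "  temperature double\n"
        ")\n"
        "PARTITIONED BY (reading_date string)\n"
        "STORED AS PARQUET\n"
        f"LOCATION 's3://{BUCKET}/readings/'\n"
        # Marks the underlying data as client-side encrypted with KMS.
        "TBLPROPERTIES ('has_encrypted_data'='true')"
    )


def add_partition_ddl(day: date) -> str:
    """Register one calendar-day prefix as a partition (assumed layout)."""
    d = day.isoformat()
    return (
        f"ALTER TABLE {TABLE} ADD IF NOT EXISTS "
        f"PARTITION (reading_date = '{d}') "
        f"LOCATION 's3://{BUCKET}/readings/{d}/'"
    )


def moving_average_query(day: date, window: int = 5) -> str:
    """Build a per-sensor moving average over the last `window` readings."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return (
        "SELECT sensor_id, reading_time,\n"
        "  AVG(temperature) OVER (\n"
        "    PARTITION BY sensor_id ORDER BY reading_time\n"
        f"    ROWS BETWEEN {window - 1} PRECEDING AND CURRENT ROW\n"
        "  ) AS moving_avg\n"
        f"FROM {TABLE}\n"
        # Filtering on the partition column keeps the scan to one day's prefix.
        f"WHERE reading_date = '{day.isoformat()}'"
    )


if __name__ == "__main__":
    day = date(2024, 1, 15)
    for statement in (create_table_ddl(), add_partition_ddl(day),
                      moving_average_query(day)):
        print(statement, end="\n\n")
```

The generated statements could be pasted into the Athena query editor. The principal that runs them would also need permission to decrypt with the KMS key that the ingestion process used.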