AWS Certified Solutions Architect – Associate (SAA-C03) — Question 764
A company has an Amazon S3 data lake. The company needs a solution that transforms the data from the data lake and loads the data into a data warehouse every day. The data warehouse must have massively parallel processing (MPP) capabilities.
Data analysts then need to create and train machine learning (ML) models by using SQL commands on the data. The solution must use serverless AWS services wherever possible.
Which solution will meet these requirements?
Answer options
- A. Run a daily Amazon EMR job to transform the data and load the data into Amazon Redshift. Use Amazon Redshift ML to create and train the ML models.
- B. Run a daily Amazon EMR job to transform the data and load the data into Amazon Aurora Serverless. Use Amazon Aurora ML to create and train the ML models.
- C. Run a daily AWS Glue job to transform the data and load the data into Amazon Redshift Serverless. Use Amazon Redshift ML to create and train the ML models.
- D. Run a daily AWS Glue job to transform the data and load the data into Amazon Athena tables. Use Amazon Athena ML to create and train the ML models.
Correct answer: C
Explanation
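Option C is the only choice in which every component is serverless and every requirement is met. AWS Glue is a serverless ETL service, so a scheduled Glue job can transform the S3 data and load it each day without any cluster to manage. Amazon Redshift Serverless provides a massively parallel processing (MPP) data warehouse without provisioned nodes. Amazon Redshift ML lets analysts create and train models with the SQL CREATE MODEL command and then call the resulting model as a SQL function.
The other options fail for the following reasons:
- A. A standard Amazon EMR job runs on a cluster that must be provisioned and managed, and the option names provisioned Amazon Redshift instead of Redshift Serverless, so it does not use serverless services wherever possible.
- B. Amazon Aurora Serverless is a transactional relational database, not an MPP data warehouse. Aurora machine learning calls models hosted in other AWS services for inference; it does not train models with SQL.
- D. Amazon Athena is a serverless query service over data in Amazon S3, not a dedicated MPP data warehouse. Its ML integration invokes existing Amazon SageMaker models for inference and does not create or train models.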
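To make the "ML models by using SQL commands" requirement concrete, here is a minimal Python sketch of the kind of CREATE MODEL statement that Redshift ML accepts, and of one way to submit it to a Redshift Serverless workgroup through the Redshift Data API. The table, column, model, function, bucket, workgroup, and database names are all hypothetical placeholders and do not come from the question. The sketch also assumes a default IAM role is already associated with the namespace.

```python
def build_create_model_sql(model_name, source_query, target_column,
                           function_name, s3_bucket, iam_role="default"):
    """Build a Redshift ML CREATE MODEL statement as a string.

    All argument values are illustrative; nothing here is validated
    against a real cluster.
    """
    # IAM_ROLE takes either the keyword default or a quoted role ARN.
    role_clause = "default" if iam_role == "default" else f"'{iam_role}'"
    return (
        f"CREATE MODEL {model_name}\n"
        f"FROM ({source_query})\n"
        f"TARGET {target_column}\n"
        f"FUNCTION {function_name}\n"
        f"IAM_ROLE {role_clause}\n"
        f"SETTINGS (S3_BUCKET '{s3_bucket}');"
    )


def submit_to_redshift_serverless(sql, workgroup, database):
    """Submit SQL to a Redshift Serverless workgroup via the Data API.

    boto3 is imported lazily so the SQL builder above can be used
    without AWS dependencies or credentials.
    """
    import boto3

    client = boto3.client("redshift-data")
    response = client.execute_statement(
        WorkgroupName=workgroup,  # hypothetical workgroup name
        Database=database,        # hypothetical database name
        Sql=sql,
    )
    return response["Id"]  # statement ID, usable with describe_statement


if __name__ == "__main__":
    # Hypothetical example: train a churn model on a table that the
    # daily AWS Glue job is assumed to have loaded.
    example_sql = build_create_model_sql(
        model_name="churn_model",
        source_query="SELECT age, plan, churned FROM analytics.customers",
        target_column="churned",
        function_name="predict_churn",
        s3_bucket="example-ml-bucket",
    )
    print(example_sql)
```

Once training finishes, analysts would call the generated function (here the hypothetical predict_churn) inside an ordinary SELECT, so both training and inference stay in SQL.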