Question

A company is setting up a data pipeline in AWS. The pipeline extracts client data from Amazon S3 buckets, performs quality checks, and transforms the data. The pipeline stores the processed data in a relational database. The company will use the processed data for future queries. Which solution will meet these requirements MOST cost-effectively?

Accepted Answer

Correct answer: A. Use AWS Glue ETL to extract the data from the S3 buckets and perform the transformations. Use AWS Glue Data Quality to enforce suggested quality rules. Load the data and the quality check results into an Amazon RDS for MySQL instance.

Option A is the most cost-effective because a single service, AWS Glue, handles extraction, transformation, and quality checks (through AWS Glue Data Quality), and the results are loaded directly into a relational database where they can be queried later. The other options either add services or stage data in S3 unnecessarily, which raises cost without a significant benefit.
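The extract, check, transform, and load flow described in the answer can be illustrated with a minimal local sketch. This is not AWS Glue code: the in-memory list stands in for objects read from S3, the hand-written checks stand in for AWS Glue Data Quality rules, and SQLite stands in for the Amazon RDS for MySQL instance. All record fields, rule names, and table names are hypothetical.

```python
import sqlite3

# Hypothetical sample records standing in for client data read from S3.
RAW_RECORDS = [
    {"client_id": "c-001", "email": "Ana@Example.com ", "balance": "120.50"},
    {"client_id": "c-002", "email": "", "balance": "75.00"},             # empty email
    {"client_id": "c-003", "email": "li@example.com", "balance": "-5"},  # negative balance
]


def check_quality(record):
    """Return the names of the rules this record fails (empty list = passed)."""
    failures = []
    if not record["client_id"]:
        failures.append("client_id_complete")
    if not record["email"].strip():
        failures.append("email_complete")
    if float(record["balance"]) < 0:
        failures.append("balance_non_negative")
    return failures


def transform(record):
    """Normalize a record into the row shape of the clients table."""
    return (
        record["client_id"],
        record["email"].strip().lower(),
        float(record["balance"]),
    )


def run_pipeline(records, conn):
    """Load passing records and store quality failures in the same database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS clients "
        "(client_id TEXT PRIMARY KEY, email TEXT, balance REAL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS quality_results "
        "(client_id TEXT, failed_rule TEXT)"
    )
    loaded = 0
    for record in records:
        failures = check_quality(record)
        if failures:
            # Keep the quality check results next to the data, as option A does.
            conn.executemany(
                "INSERT INTO quality_results VALUES (?, ?)",
                [(record["client_id"], rule) for rule in failures],
            )
        else:
            conn.execute("INSERT INTO clients VALUES (?, ?, ?)", transform(record))
            loaded += 1
    conn.commit()
    return loaded
```

Calling `run_pipeline(RAW_RECORDS, sqlite3.connect(":memory:"))` loads the one clean record and records one failure row for each of the other two. In the actual solution, the checks would be expressed as AWS Glue Data Quality rules inside the Glue ETL job rather than as Python functions; the sketch only shows why keeping data and quality results in one relational store needs no intermediate staging.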

AWS Certified Data Engineer – Associate (DEA-C01) — Question 213
