AWS Certified Data Engineer – Associate (DEA-C01) — Question 162

A company has a data lake in Amazon S3. The company uses AWS Glue to catalog data and AWS Glue Studio to implement data extract, transform, and load (ETL) pipelines.

The company needs to ensure that data quality issues are checked every time the pipelines run. A data engineer must enhance the existing pipelines to evaluate data quality rules based on predefined thresholds.

Which solution will meet these requirements with the LEAST implementation effort?

Answer options

Correct answer: B

Explanation

Option B is correct because it adds the Evaluate Data Quality transform directly to the existing AWS Glue Studio pipelines. The transform is built for this purpose: rules and their thresholds are written in Data Quality Definition Language (DQDL) and are evaluated as part of the job, so the checks happen on every pipeline run without custom code. Options A, C, and D involve additional complexity and custom development, so they require more implementation effort than the built-in AWS Glue data quality capability.
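
As a rough illustration of what threshold-based rules look like in DQDL, the following Python sketch assembles a small ruleset string. The helper name build_dqdl_ruleset and the column names order_id and customer_email are hypothetical and do not come from the question. The rule types shown (RowCount, IsComplete, Completeness, Uniqueness) are standard DQDL rule types, but the thresholds are placeholders.

```python
def build_dqdl_ruleset(rules):
    """Join individual DQDL rule strings into one 'Rules = [...]' definition."""
    body = ",\n    ".join(rules)
    return f"Rules = [\n    {body}\n]"


# Hypothetical rules; column names and thresholds are assumptions for illustration.
rules = [
    'RowCount > 0',                           # the dataset must not be empty
    'IsComplete "order_id"',                  # no NULL values allowed in order_id
    'Completeness "customer_email" >= 0.95',  # at least 95% of values are non-NULL
    'Uniqueness "order_id" > 0.99',           # more than 99% of values are unique
]

ruleset = build_dqdl_ruleset(rules)
print(ruleset)
```

In AWS Glue Studio, a ruleset of this shape would typically be entered in the ruleset editor of the Evaluate Data Quality transform rather than generated by code. The helper above only makes the structure of a DQDL ruleset explicit.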