AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 20

A company has a large, unstructured dataset. The dataset includes many duplicate records across several key attributes.
Which solution on AWS will detect duplicates in the dataset with the LEAST code development?

Answer options

Correct answer: D

Explanation

The correct answer, D, is appropriate because AWS Glue FindMatches is specifically designed for identifying duplicates with minimal coding. The other options, while capable of addressing the issue, require more complex setups or custom development, making them less efficient for this specific need.