AWS Certified Data Engineer – Associate (DEA-C01) — Question 41

A company needs to set up a data catalog and metadata management for data sources that run in the AWS Cloud. The company will use the data catalog to maintain the metadata of all the objects that are in a set of data stores. The data stores include structured sources such as Amazon RDS and Amazon Redshift. The data stores also include semistructured sources such as JSON files and .xml files that are stored in Amazon S3.
The company needs a solution that will update the data catalog on a regular basis. The solution also must detect changes to the source metadata.
Which solution will meet these requirements with the LEAST operational overhead?

Answer options

Correct answer: B

Explanation

The correct answer is B. The AWS Glue Data Catalog is a serverless metadata repository, and AWS Glue crawlers populate it automatically. A crawler connects to Amazon RDS and Amazon Redshift through JDBC connections and reads Amazon S3 directly, infers schemas (built-in classifiers cover JSON and XML files), and writes the resulting table definitions to the catalog. Running the crawlers on a schedule meets both requirements: each run refreshes the catalog and picks up changes to the source metadata. Options A and C rely on AWS Lambda functions that the company would have to manage itself, which raises operational overhead. Option D also uses the Glue Data Catalog but extracts the schemas manually, which is unnecessary work when crawlers can do this automatically.
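
As a rough illustration of the crawler approach described above, the following Python sketch assembles the arguments for the Glue CreateCrawler API and passes them to boto3. The crawler name, role ARN, database name, S3 path, connection name, and cron expression are all hypothetical placeholders, not values from the question. The helper names build_crawler_config and create_crawler are also made up for this example. It assumes boto3 is installed and AWS credentials are configured before create_crawler is called.

```python
from __future__ import annotations

from typing import Any


def build_crawler_config(
    name: str,
    role_arn: str,
    database: str,
    s3_paths: list[str],
    jdbc_targets: list[dict[str, str]],
    schedule: str = "cron(0 2 * * ? *)",  # hypothetical: daily at 02:00 UTC
) -> dict[str, Any]:
    """Assemble keyword arguments for the Glue CreateCrawler API."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {
            # S3 prefixes holding the JSON and XML files
            "S3Targets": [{"Path": path} for path in s3_paths],
            # RDS / Redshift sources, reached through Glue JDBC connections
            "JdbcTargets": jdbc_targets,
        },
        # The schedule gives the regular catalog refresh
        "Schedule": schedule,
        # Apply detected schema changes to the catalog; only log deletions
        "SchemaChangePolicy": {
            "UpdateBehavior": "UPDATE_IN_DATABASE",
            "DeleteBehavior": "LOG",
        },
    }


def create_crawler(config: dict[str, Any]) -> None:
    """Create the crawler in AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the config helper works without it

    glue = boto3.client("glue")
    glue.create_crawler(**config)


# Hypothetical example values
example_config = build_crawler_config(
    name="catalog-refresh-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
    database="example_catalog_db",
    s3_paths=["s3://example-bucket/semistructured/"],
    jdbc_targets=[{"ConnectionName": "example-rds-connection", "Path": "exampledb/%"}],
)
```

Calling create_crawler(example_config) would register the crawler. After that, Glue runs it on the schedule with no Lambda functions or custom extraction code to maintain, which is the low-overhead property the explanation relies on.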