AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 132

A company is developing a new ML model that uses the XGBoost algorithm. The company will train the model on data that is stored in an Amazon S3 bucket. The data is in a nested JSON format.

An ML engineer needs to convert the JSON files into a tabular format.

Which solution will meet this requirement with the LEAST operational overhead?

Answer options

Correct answer: A

Explanation

Option A is correct because AWS Glue is a serverless ETL service: it can read the nested JSON files from Amazon S3 and flatten them into a tabular format with built-in transforms (for example, Relationalize), with no infrastructure to provision or manage. The other options add operational overhead. Option B requires writing and maintaining custom code, and option C requires managing Lambda functions and their invocations. Option D is workable, but it depends on setting up an Athena database first, which is less direct than Glue for this specific transformation task.
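
To make the phrase "convert nested JSON into a tabular format" concrete, here is a minimal sketch in plain Python. It is not AWS Glue code and not how Glue implements the transformation. It only illustrates the kind of flattening that a managed transform performs for you. The sample records, the `flatten` helper, and the dotted column-naming scheme are all hypothetical and chosen for illustration.

```python
from typing import Any, Dict, List


def flatten(record: Dict[str, Any], parent_key: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten nested dicts into one level, joining keys with `sep`.

    Illustrative helper only (hypothetical, not part of any AWS SDK).
    Lists are left untouched; a real ETL transform would typically
    split arrays out into separate tables or rows.
    """
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the column prefix along.
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat


# Hypothetical nested records, standing in for the JSON files in S3.
records: List[Dict[str, Any]] = [
    {"id": 1, "customer": {"name": "Ana", "address": {"city": "Lima"}}, "amount": 12.5},
    {"id": 2, "customer": {"name": "Bo", "address": {"city": "Oslo"}}, "amount": 7.0},
]

# Each flattened record is one row; the union of keys gives the table's columns.
rows = [flatten(r) for r in records]
columns = sorted({column for row in rows for column in row})

if __name__ == "__main__":
    print(columns)
    for row in rows:
        print([row.get(column) for column in columns])
```

Writing, testing, and operating code like this yourself is the kind of overhead the explanation attributes to option B. A serverless ETL service removes that maintenance burden.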