
Question

A company needs to analyze a large dataset that is stored in Amazon S3 in Apache Parquet format. The company wants to use one-hot encoding for some of the columns. The company needs a no-code solution to transform the data. The solution must store the transformed data back to the same S3 bucket for model training. Which solution will meet these requirements?

Accepted Answer

Correct answer: A. Configure an AWS Glue DataBrew project that connects to the data. Use the DataBrew interactive interface to create a recipe that performs the one-hot encoding transformation. Create a job to apply the transformation and to write the output back to an S3 bucket.

A is correct because AWS Glue DataBrew is a no-code data preparation tool: users build transformations, including one-hot encoding, as recipe steps in an interactive interface, and a DataBrew job then applies the recipe to the full dataset and writes the result to Amazon S3. Option B requires writing SQL, option C uses a notebook, which also involves coding, and option D relies on Amazon Redshift, which does not meet the no-code requirement.
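For readers unfamiliar with the transformation itself, the following is a minimal sketch of what one-hot encoding does to a categorical column. It is plain Python for illustration only; it is not how DataBrew implements the step, and the accepted answer needs no code at all. The function name `one_hot_encode`, the sample rows, and the `color` column are hypothetical.

```python
def one_hot_encode(rows, column):
    """Replace `column` in each row with one 0/1 column per distinct value."""
    # Collect the distinct categories; sorting makes the column order stable.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical column being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator column per category: 1 if it matches, else 0.
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical sample rows; the dataset in the question is Parquet in Amazon S3.
rows = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "blue"},
    {"id": 3, "color": "red"},
]
encoded_rows = one_hot_encode(rows, "color")
```

Each distinct value of the encoded column becomes its own indicator column, which is the shape many training algorithms expect for categorical features.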

AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 157
