AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 71
A company wants to develop an ML model by using tabular data from its customers. The data contains meaningful ordered features with sensitive information that should not be discarded. An ML engineer must ensure that the sensitive data is masked before another team starts to build the model.
Which solution will meet these requirements?
Answer options
- A. Use Amazon Macie to categorize the sensitive data.
- B. Prepare the data by using AWS Glue DataBrew.
- C. Run an AWS Batch job to change the sensitive data to random values.
- D. Run an Amazon EMR job to change the sensitive data to random values.
Correct answer: B
Explanation
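The correct answer is B. AWS Glue DataBrew is a visual data preparation service that includes built-in transformations for personally identifiable information (PII), such as masking, hashing, substitution, and encryption. These transformations obscure the sensitive values while keeping the columns and the overall structure of the dataset, so the other team can still use the features to build the model. Option A only categorizes the sensitive data (Amazon Macie is a discovery and classification service for data in Amazon S3); it does not mask anything. Options C and D replace the sensitive data with random values, which destroys the ordering and meaning of those features and effectively discards the information the question says must be kept. Both also require custom code and compute management for a task that DataBrew handles natively.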
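The difference between masking and random replacement can be illustrated with a short sketch. This is plain Python, not the DataBrew API; the column names, the sample rows, the key, and the helper functions (`mask_out`, `deterministic_hash`, `random_replace`) are all hypothetical and exist only to show the idea under those assumptions.

```python
import hashlib
import hmac
import random

# Hypothetical key for illustration; a real key would come from a secrets store.
SECRET_KEY = b"example-key-not-for-production"


def mask_out(value: str, keep_last: int = 4, mask_char: str = "#") -> str:
    """Replace all but the last `keep_last` characters with `mask_char`."""
    if keep_last <= 0:
        return mask_char * len(value)
    hidden = max(len(value) - keep_last, 0)
    return mask_char * hidden + value[hidden:]


def deterministic_hash(value: str, key: bytes = SECRET_KEY) -> str:
    """Keyed hash: equal inputs always map to the same 16-character token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def random_replace(value: str, rng: random.Random) -> str:
    """Swap the value for unrelated random digits (the idea behind C and D)."""
    return "".join(rng.choice("0123456789") for _ in value)


# Hypothetical customer rows; "tier" stands in for an ordered feature.
rows = [
    {"customer_id": "C-1001", "card_number": "4111111111111111", "tier": 1},
    {"customer_id": "C-1002", "card_number": "5500000000000004", "tier": 3},
    {"customer_id": "C-1001", "card_number": "4111111111111111", "tier": 1},
]

masked_rows = [
    {
        "customer_id": deterministic_hash(row["customer_id"]),
        "card_number": mask_out(row["card_number"]),
        "tier": row["tier"],  # non-sensitive ordered feature is left untouched
    }
    for row in rows
]

for row in masked_rows:
    print(row)
```

In this sketch, rows 1 and 3 still share the same `customer_id` token after masking, so grouping and joins keep working, and the card numbers keep their length and last four digits. Calling `random_replace` on the same columns would give the repeated customer unrelated values in each row, which is why options C and D lose information that the model-building team may need.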