An ML engineer wants to use a set of survey responses as training data for an ML classifi…

Question

An ML engineer wants to use a set of survey responses as training data for an ML classifier. All the survey responses are either “yes” or “no.” The ML engineer needs to convert the responses into a feature that will produce better model training results. The ML engineer must not increase the dimensionality of the dataset. Which methods will meet these requirements? (Choose two.)

Accepted Answer

Correct answer: A, B. A. Binary encoding — B. Label encoding — Binary encoding and label encoding are suitable methods for converting categorical data into numeric format without increasing dimensionality, which is essential for this scenario. One-hot encoding, on the other hand, would increase dimensionality as it creates additional binary columns for each category, while statistical imputation and tokenization do not specifically address the need to encode the binary responses.

AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 135

Answer options

Correct answer: A, B

Explanation