Question

A machine learning (ML) engineer is preparing a dataset for a classification model. The ML engineer notices that some continuous numeric features have significantly greater values than most other features. A business expert explains that the features are independently informative and that the dataset is representative of the target distribution. After training, the model's inference accuracy is lower than expected. Which preprocessing technique will result in the GREATEST increase in the model's inference accuracy?

Accepted Answer

Correct answer: A. Normalize the problematic features. Normalizing the continuous features scales them to a consistent range, which prevents features with larger magnitudes from dominating the model's weight updates during training. Because the business expert confirmed that the features are independently informative and that the dataset is representative of the target distribution, dropping them would discard valuable signal, and bootstrapping or extrapolation would not resolve the underlying disparity in feature scale.
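To make the scaling argument concrete, here is a minimal sketch of two common ways to normalize a feature: min-max scaling to [0, 1] and z-score standardization. The feature names (`income`, `age`) and their values are hypothetical toy data, not part of the question, and the helper functions are illustrative rather than any particular library's API.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Linearly rescale values into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no scale information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical toy features on very different scales.
income = [30_000.0, 52_000.0, 75_000.0, 120_000.0]
age = [25.0, 35.0, 45.0, 65.0]

# After scaling, both features occupy the same [0, 1] range,
# so neither dominates purely because of its units.
income_scaled = min_max_scale(income)
age_scaled = min_max_scale(age)

# Alternative: z-scores, which center each feature at 0.
income_z = standardize(income)
age_z = standardize(age)
```

In practice the scaling parameters (minimum and maximum, or mean and standard deviation) would typically be computed on the training data only and then reused unchanged on the data seen at inference time.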

AWS Certified Machine Learning – Specialty — Question 334
