AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 73
An ML engineer normalized training data by using min-max normalization in AWS Glue DataBrew. The ML engineer must normalize the production inference data in the same way as the training data before passing the production inference data to the model for predictions.
Which solution will meet this requirement?
Answer options
- A. Apply statistics from a well-known dataset to normalize the production samples.
- B. Keep the min-max normalization statistics from the training set. Use these values to normalize the production samples.
- C. Calculate a new set of min-max normalization statistics from a batch of production samples. Use these values to normalize all the production samples.
- D. Calculate a new set of min-max normalization statistics from each production sample. Use these values to normalize all the production samples.
Correct answer: B
Explanation
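The correct answer is B because min-max normalization maps each feature value x to (x - min) / (max - min), and the model learned its weights on inputs scaled with the training set's min and max. Reusing those stored values at inference time puts the production inputs on the same scale that the model saw during training. Option A is incorrect because statistics from an unrelated dataset have no connection to the training data's feature ranges, so the scaled inputs would not match what the model expects. Option C is incorrect because statistics recalculated from a production batch will generally differ from the training statistics, so the same raw value would map to a different scaled value. Option D is incorrect because a single sample does not provide a meaningful minimum and maximum for a feature, and statistics derived per sample would change from one sample to the next.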
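The difference between options B and C can be illustrated with a minimal sketch in plain Python. This is not DataBrew code; the names `MinMaxStats`, `fit_min_max`, and `apply_min_max` and the sample values are hypothetical, chosen only to show the idea of fitting the statistics once on training data and reusing them at inference time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinMaxStats:
    """Hypothetical container for the statistics saved from training."""
    minimum: float
    maximum: float


def fit_min_max(values):
    """Compute min-max statistics from a list of feature values."""
    return MinMaxStats(minimum=min(values), maximum=max(values))


def apply_min_max(value, stats):
    """Scale one value with previously computed statistics."""
    span = stats.maximum - stats.minimum
    if span == 0:
        # Constant feature: avoid division by zero (an assumed convention).
        return 0.0
    return (value - stats.minimum) / span


# Illustrative values only, not taken from any real dataset.
training_values = [10.0, 20.0, 30.0, 50.0]
production_values = [20.0, 60.0]

# Option B: fit once on the training data, reuse for production.
training_stats = fit_min_max(training_values)
scaled_with_training = [apply_min_max(v, training_stats) for v in production_values]

# Option C: refit on the production batch (shown only for contrast).
batch_stats = fit_min_max(production_values)
scaled_with_batch = [apply_min_max(v, batch_stats) for v in production_values]

print(scaled_with_training)  # [0.25, 1.25]
print(scaled_with_batch)     # [0.0, 1.0]
```

With the training statistics, the raw value 20.0 scales to 0.25, exactly as it would have during training. Refitting on the production batch maps the same raw value to 0.0, which the model would interpret as a different input. The value 60.0 falls outside the training range and scales to 1.25; whether to clip such values is a separate design decision that this sketch leaves open.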