SAS Statistical Business Analysis Using SAS 9: Regression and Modeling — Question 27

When mean imputation is performed after the data has been partitioned for honest assessment, what is the most appropriate way to carry out the imputation?

Answer options

Correct answer: B

Explanation

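The correct answer is B because the means used for imputation should be computed from the training data only and then applied to the other partitions. This avoids data leakage, so performance measured on the validation and test sets remains an honest estimate of how the model will behave on new data. Options A and C incorrectly use means from the validation and test sets, which lets information from the holdout data influence data preparation and can lead to inflated performance metrics. Option D is not suitable because it uses means computed within each partition: the holdout partitions would then supply their own imputation values, which cannot be done when scoring a single new observation and means the model is assessed on data prepared differently from the data it was trained on.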
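
The principle in the explanation can be illustrated with a minimal sketch. It is written in plain Python rather than SAS, and the function names `fit_means` and `impute`, the column name `income`, and the toy rows are all hypothetical, chosen only for illustration. The sketch assumes missing values are represented as `None` and that every imputed column has at least one observed value in the training partition.

```python
from statistics import mean


def fit_means(train_rows, columns):
    """Compute per-column means from the training partition only.

    Missing values (None) are skipped. Raises statistics.StatisticsError
    if a column has no observed values in the training data.
    """
    means = {}
    for col in columns:
        observed = [row[col] for row in train_rows if row[col] is not None]
        means[col] = mean(observed)
    return means


def impute(rows, means):
    """Return copies of the rows with missing values replaced by the given means."""
    return [
        {
            col: (means[col] if val is None and col in means else val)
            for col, val in row.items()
        }
        for row in rows
    ]


# Hypothetical toy partitions; None marks a missing value.
train = [{"income": 40.0}, {"income": None}, {"income": 60.0}]
valid = [{"income": None}, {"income": 90.0}]

# The means come from the training partition alone ...
train_means = fit_means(train, ["income"])

# ... and the same training means are applied to every partition.
train_imputed = impute(train, train_means)
valid_imputed = impute(valid, train_means)
```

Here the missing validation value is filled with the training mean of the observed training values (40.0 and 60.0), not with anything derived from the validation rows, so the validation partition plays no part in data preparation.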