AWS Certified Machine Learning – Specialty — Question 224
A company has hired a data scientist to create a loan risk model. The dataset contains loan amounts and variables such as loan type, region, and other demographic variables. The data scientist wants to use Amazon SageMaker to test bias regarding the loan amount distribution with respect to some of these categorical variables.
Which pretraining bias metrics should the data scientist use to check the bias distribution? (Choose three.)
Answer options
- A. Class imbalance
- B. Conditional demographic disparity
- C. Difference in proportions of labels
- D. Jensen-Shannon divergence
- E. Kullback-Leibler divergence
- F. Total variation distance
Correct answer: D, E, F
Explanation
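Jensen-Shannon divergence, Kullback-Leibler divergence, and total variation distance each compare the distribution of the label (here, the loan amount) between the facets of a categorical variable such as loan type or region, which is what the question asks for. The other options answer different questions. Class imbalance measures only how many records each facet contributes, regardless of the label. Difference in proportions of labels compares the share of positive outcomes between facets, so it reduces the label to a single proportion instead of comparing distributions. Conditional demographic disparity compares the proportions of rejected and accepted outcomes for a facet within subgroups, so it also relies on accept/reject outcomes and not on the spread of loan amounts.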
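As a rough illustration of what the three selected metrics compute, the sketch below evaluates them on two discrete distributions. It is not SageMaker Clarify's implementation. The function names, the three-bin split of loan amounts, and the probabilities for the two facets are all hypothetical, and natural logarithms are assumed.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) with natural logs.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q against their mixture."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def total_variation_distance(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


# Hypothetical share of loans in (low, medium, high) amount bins per facet.
facet_a = [0.5, 0.3, 0.2]
facet_d = [0.2, 0.3, 0.5]

kl = kl_divergence(facet_a, facet_d)
js = js_divergence(facet_a, facet_d)
tvd = total_variation_distance(facet_a, facet_d)

print(f"KL={kl:.4f}  JS={js:.4f}  TVD={tvd:.4f}")
```

All three values are zero when the two facets have identical loan amount distributions and grow as the distributions diverge. Kullback-Leibler divergence is asymmetric and unbounded, while Jensen-Shannon divergence is symmetric and bounded by ln 2 under natural logs, and total variation distance is symmetric and bounded by 1.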