AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 169

An ML engineer is collecting data to train a classification ML model by using Amazon SageMaker AI. The target column can have two possible values: Class A or Class B. The ML engineer wants to ensure that Class A and Class B have a balanced number of samples, without losing any existing training data. The ML engineer must test the balance of the training data.

Which solution will meet this requirement?

Answer options

Correct answer: B

Explanation

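Option B is correct. SageMaker Clarify reports a class imbalance (CI) metric for the training data: a value of 0 means the two classes are balanced, and a value greater than 0 indicates that one class is overrepresented. When an imbalance is found, the Synthetic Minority Over-sampling Technique (SMOTE) balances the classes by generating synthetic samples for the minority class, so no existing training data is discarded. The other options either prescribe rebalancing when the CI value is 0, where no action is needed, or rely on techniques that can discard data or leave the classes imbalanced.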
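
The reasoning in the explanation can be sketched in plain Python. This is only an illustration under stated assumptions: the CI formula follows the definition documented for SageMaker Clarify, CI = (n_a - n_b) / (n_a + n_b), while the function names, the toy data, and the simplified SMOTE-style interpolation are hypothetical and are not the SageMaker or imbalanced-learn implementation.

```python
import math
import random


def class_imbalance(labels, class_a, class_b):
    """CI = (n_a - n_b) / (n_a + n_b); 0 means the two classes are balanced."""
    n_a = sum(1 for y in labels if y == class_a)
    n_b = sum(1 for y in labels if y == class_b)
    return (n_a - n_b) / (n_a + n_b)


def smote_like_oversample(minority, n_new, k=1, seed=0):
    """Simplified SMOTE-style sketch (an assumption, not a library API).

    Each synthetic point is placed at a random position on the line segment
    between a minority sample and one of its k nearest minority neighbors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbors of the chosen sample, excluding itself.
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Hypothetical toy data: six Class A rows and two Class B rows.
features_a = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
              (0.3, 0.3), (0.2, 0.4), (0.4, 0.2)]
features_b = [(1.0, 1.0), (1.2, 0.9)]
labels = ["A"] * len(features_a) + ["B"] * len(features_b)

ci_before = class_imbalance(labels, "A", "B")  # (6 - 2) / (6 + 2)

# Add only synthetic Class B rows; every original row is kept.
new_b = smote_like_oversample(features_b, len(features_a) - len(features_b))
balanced_features = features_a + features_b + new_b
balanced_labels = labels + ["B"] * len(new_b)

ci_after = class_imbalance(balanced_labels, "A", "B")
print(ci_before, ci_after)
```

Because the sketch only appends rows, the original eight samples remain in the dataset, which is the property the question asks for. Undersampling the majority class would also bring CI to 0, but at the cost of dropping existing data.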