Google Cloud Professional Machine Learning Engineer — Question 2
You were asked to investigate failures of a production line component based on sensor readings. After receiving the dataset, you discover that less than 1% of the readings are positive examples representing failure incidents. You have tried to train several classification models, but none of them converge. How should you resolve the class imbalance problem?
Answer options
- A. Use the class distribution to generate 10% positive examples.
- B. Use a convolutional neural network with max pooling and softmax activation.
- C. Downsample the data with upweighting to create a sample with 10% positive examples.
- D. Remove negative examples until the numbers of positive and negative examples are equal.
Correct answer: C
Explanation
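The correct answer is C. Downsampling the majority (negative) class raises the share of positive examples in the training set, so the model sees enough failure incidents to learn from and training converges more readily. Upweighting the downsampled negatives by the downsampling factor then compensates for the altered class distribution, so the model's predictions stay calibrated to the original data. Option A is too vague: it does not say how the additional positive examples would be generated. Option B changes the model architecture, which does nothing about the imbalance itself. Option D discards far more negative examples than necessary (with under 1% positives, an equal split throws away more than 98% of the dataset) and applies no upweighting to correct for the changed distribution.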
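To make option C concrete, here is a minimal sketch of downsampling with upweighting using NumPy. The function name `downsample_with_upweight`, its parameters, and the synthetic data are all hypothetical and chosen only for illustration; the sketch assumes binary labels where 1 marks a failure.

```python
import numpy as np


def downsample_with_upweight(X, y, target_pos_frac=0.10, seed=0):
    """Downsample negatives so positives make up `target_pos_frac` of the
    sample, and upweight the kept negatives by the downsampling factor."""
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    if len(pos_idx) == 0:
        raise ValueError("need at least one positive example")

    # Number of negatives to keep so that positives reach the target share.
    n_neg_keep = int(round(len(pos_idx) * (1 - target_pos_frac) / target_pos_frac))
    n_neg_keep = min(n_neg_keep, len(neg_idx))
    kept_neg = rng.choice(neg_idx, size=n_neg_keep, replace=False)

    # Each kept negative stands in for `factor` original negatives.
    factor = len(neg_idx) / n_neg_keep
    idx = np.concatenate([pos_idx, kept_neg])
    weights = np.concatenate([np.ones(len(pos_idx)), np.full(n_neg_keep, factor)])

    # Shuffle so positives and negatives are interleaved.
    perm = rng.permutation(len(idx))
    return X[idx][perm], y[idx][perm], weights[perm]


# Synthetic stand-in for the sensor data: 10,000 readings, 0.5% failures.
data_rng = np.random.default_rng(42)
y = np.zeros(10_000, dtype=int)
y[:50] = 1
X = data_rng.normal(size=(10_000, 3))

X_s, y_s, w_s = downsample_with_upweight(X, y)
print("sample size:", len(y_s))
print("positive fraction:", y_s.mean())
print("total negative weight:", w_s[y_s == 0].sum())
```

The returned weights can be passed to any training routine that accepts per-example weights, for example the `sample_weight` argument of scikit-learn's `LogisticRegression.fit`. Because the kept negatives' weights sum to the original negative count, the weighted loss still reflects the original class distribution.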