AWS Certified Machine Learning – Specialty — Question 206
A company is building a predictive maintenance model for its warehouse equipment. The model must predict the probability of failure of all machines in the warehouse. The company has collected 10,000 event samples within 3 months. The event samples include 100 failure cases that are evenly distributed across 50 different machine types.
How should the company prepare the data for the model to improve the model's accuracy?
Answer options
- A. Adjust the class weight to account for each machine type.
- B. Oversample the failure cases by using the Synthetic Minority Oversampling Technique (SMOTE).
- C. Undersample the non-failure events. Stratify the non-failure events by machine type.
- D. Undersample the non-failure events by using the Synthetic Minority Oversampling Technique (SMOTE).
Correct answer: B
Explanation
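The correct answer is B. Failures make up only 100 of the 10,000 events (1%), or 2 per machine type, so the dataset is heavily imbalanced. SMOTE creates synthetic samples of the minority class (failure cases) by interpolating between existing failure cases and their nearest minority-class neighbors. This balances the dataset without discarding any data and helps the model learn the failure class. Option A adjusts weights by machine type, which does not address the imbalance between failure and non-failure cases. Option C would discard a large share of the 9,900 non-failure events and could lose important data. Option D is incorrect because SMOTE is an oversampling technique and cannot be used to undersample.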
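To make the interpolation idea behind SMOTE concrete, here is a minimal, standard-library-only sketch. It is an illustration under stated assumptions, not the reference implementation: the function name `smote_oversample`, the two sensor features, and the randomly generated failure data are all hypothetical. In practice a maintained implementation such as the `SMOTE` class in the imbalanced-learn library would normally be used, and oversampling should be applied to the training split only.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic new points interpolated between minority neighbours.

    Hypothetical helper for illustration; not a library API.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical data: 100 failure events with two made-up sensor features.
data_rng = random.Random(42)
failures = [[data_rng.gauss(80.0, 5.0), data_rng.gauss(3.0, 0.5)] for _ in range(100)]
n_non_failures = 9_900  # 10,000 events minus 100 failures

# Generate enough synthetic failures to match the non-failure count.
new_failures = smote_oversample(failures, n_synthetic=n_non_failures - len(failures))
balanced_failure_count = len(failures) + len(new_failures)
print(balanced_failure_count, n_non_failures)
```

Each synthetic point lies on the line segment between a real failure case and one of its nearest failure neighbours, so the new samples stay inside the region already covered by observed failures rather than duplicating existing rows.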