Google Cloud Professional Machine Learning Engineer — Question 84
You are working on a binary classification ML algorithm that detects whether an image of a classified scanned document contains a company’s logo. In the dataset, 96% of examples don’t have the logo, so the dataset is very skewed. Which metrics would give you the most confidence in your model?
Answer options
- A. F-score where recall is weighted more than precision
- B. RMSE
- C. F1 score
- D. F-score where precision is weighted more than recall
Correct answer: A
Explanation
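The correct answer is A. With 96% of examples lacking the logo, the positive (logo) class is the minority, and a model can look accurate while missing most of it. An F-score that weights recall more than precision (F-beta with beta > 1, for example F2) rewards the model for capturing more of the true logo instances, which is what matters for the minority class here. Option B (RMSE) is a regression metric and is not suitable for a classification task. Option C (F1 score) weights precision and recall equally, so it does not prioritize finding the rare positives. Option D (F-score where precision is weighted more) rewards avoiding false alarms over finding logos, which works against detecting the minority class.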
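To make the effect of the weighting concrete, here is a minimal sketch of the F-beta score computed from confusion-matrix counts. The helper name `f_beta`, the 1,000-image evaluation set, and the two models' counts are all hypothetical numbers chosen for illustration; they are not part of the question.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta from confusion counts.

    beta > 1 weights recall more, beta < 1 weights precision more,
    and beta == 1 gives the ordinary F1 score.
    """
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined; report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical evaluation set: 1000 images, 40 of which contain the logo (4%).
# "cautious" finds 20 of the 40 logos and raises 2 false alarms.
# "eager" finds 36 of the 40 logos and raises 24 false alarms.
# "always_no" predicts "no logo" for every image (96% accurate, finds nothing).
models = {
    "cautious": dict(tp=20, fp=2, fn=20),
    "eager": dict(tp=36, fp=24, fn=4),
    "always_no": dict(tp=0, fp=0, fn=40),
}

for name, counts in models.items():
    scores = {b: round(f_beta(beta=b, **counts), 3) for b in (0.5, 1.0, 2.0)}
    print(name, scores)
```

With these assumed counts, the recall-weighted score (beta = 2) ranks the "eager" model above the "cautious" one, while the precision-weighted score (beta = 0.5) reverses the ranking, and the always-negative model scores 0 at every beta despite its 96% accuracy. If scikit-learn is available, `sklearn.metrics.fbeta_score(y_true, y_pred, beta=2)` computes the same quantity from label arrays.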