Google Cloud Professional Machine Learning Engineer — Question 82
You work for a large social network service provider whose users post articles and discuss news. Millions of comments are posted online each day, and more than 200 human moderators constantly review comments and flag those that are inappropriate. Your team is building an ML model to help human moderators check content on the platform. The model scores each comment and flags suspicious comments to be reviewed by a human. Which metric(s) should you use to monitor the model’s performance?
Answer options
- A. Number of messages flagged by the model per minute
- B. Number of messages flagged by the model per minute confirmed as being inappropriate by humans
- C. Precision and recall estimates based on a random sample of 0.1% of raw messages each minute sent to a human for review
- D. Precision and recall estimates based on a sample of messages flagged by the model as potentially inappropriate each minute
Correct answer: D
Explanation
Option D is correct because precision and recall describe how well the model separates inappropriate from appropriate comments, and sampling from the messages the model flags concentrates human review where inappropriate content is most likely, so the estimates rest on enough positive examples to be meaningful. Options A and B are raw counts: they rise and fall with traffic volume and do not show how many flags are wrong or how many inappropriate comments are missed. Option C samples 0.1% of all raw messages; if, as is typical, inappropriate comments are a small minority, such a sample contains very few of them, so its estimates are noisier and make less efficient use of moderator time.
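
To make the metrics concrete, here is a minimal Python sketch of how precision and recall could be computed from human-reviewed messages. The `Review` record and the `precision_recall` function are hypothetical names, not part of any Google Cloud API, and the counts are made up for illustration. The sketch also assumes that the moderators' independent reviews surface some inappropriate comments the model did not flag: a sample of flagged messages alone yields precision, but it cannot reveal the missed comments that recall depends on.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """One human-reviewed message (hypothetical record for illustration)."""
    flagged_by_model: bool  # did the model flag this message?
    inappropriate: bool     # the human moderator's verdict


def precision_recall(reviews):
    """Return (precision, recall); a value is None when its denominator is zero."""
    tp = sum(r.flagged_by_model and r.inappropriate for r in reviews)
    fp = sum(r.flagged_by_model and not r.inappropriate for r in reviews)
    fn = sum(not r.flagged_by_model and r.inappropriate for r in reviews)
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall


# Made-up sample for one time window:
# 8 flagged messages reviewed, 6 confirmed inappropriate, 2 false alarms,
# plus 4 inappropriate messages moderators found that the model missed.
sample = (
    [Review(True, True)] * 6
    + [Review(True, False)] * 2
    + [Review(False, True)] * 4
)

precision, recall = precision_recall(sample)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these counts, precision is 6 / (6 + 2) = 0.75 and recall is 6 / (6 + 4) = 0.60. Tracking both values over time, rather than a raw flag count, shows whether the model is raising too many false alarms or letting inappropriate comments through.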