AWS Certified Machine Learning – Specialty — Question 103

A financial company is trying to detect credit card fraud. The company observed that, on average, 2% of credit card transactions were fraudulent. A data scientist trained a classifier on a year's worth of credit card transaction data. The model needs to identify the fraudulent transactions (positives) from the regular ones (negatives). The company's goal is to accurately capture as many positives as possible.
Which metrics should the data scientist use to optimize the model? (Choose two.)

Answer options

A. Specificity
B. False positive rate
C. Accuracy
D. Area under the precision-recall curve
E. True positive rate

Correct answer: D, E

Explanation

The metrics to optimize in this scenario are the area under the precision-recall curve (D) and the true positive rate (E). The true positive rate (recall) is the fraction of fraudulent transactions the model catches, which is exactly what the company wants to maximize. The area under the precision-recall curve summarizes the trade-off between recall and precision across all decision thresholds and, unlike accuracy, stays informative when positives make up only 2% of the data. Specificity (A) and false positive rate (B) describe only how the model handles negatives, and accuracy (C) is dominated by the majority class: a model that labels every transaction as legitimate reaches about 98% accuracy while catching no fraud.
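
The reasoning above can be checked with a small sketch. The data here is synthetic and assumed purely for illustration (1,000 transactions, 20 of them fraudulent, matching the 2% rate in the question), and the helper names `confusion_counts`, `summarize`, and `average_precision` are hypothetical. Average precision is used as one common step-wise estimate of the area under the precision-recall curve; it is not the only way to compute that area.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Compute the threshold-based metrics named in the answer options."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "true_positive_rate": tp / (tp + fn),   # recall
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),
    }


def average_precision(y_true, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    Averages the precision measured at each rank where a positive appears,
    after sorting by descending score. Ties are broken by input order.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits


# Synthetic, assumed data: 20 fraudulent transactions out of 1,000 (2%).
y_true = [1] * 20 + [0] * 980

# Model 1: labels every transaction as legitimate.
y_pred_always_negative = [0] * 1000
baseline = summarize(y_true, y_pred_always_negative)
print(baseline["accuracy"], baseline["true_positive_rate"])  # 0.98 0.0

# Model 2: catches 18 of 20 frauds at the cost of 50 false alarms.
y_pred_flagging = [1] * 18 + [0] * 2 + [1] * 50 + [0] * 930
flagging_metrics = summarize(y_true, y_pred_flagging)
print(flagging_metrics["accuracy"], flagging_metrics["true_positive_rate"])  # 0.948 0.9

# A scorer that ranks every fraud above every legitimate transaction.
perfect_scores = [float(label) for label in y_true]
print(average_precision(y_true, perfect_scores))  # 1.0
```

In this sketch the model that never flags fraud has the higher accuracy and a perfect specificity, yet it captures no positives, while the second model gives up some accuracy to reach a true positive rate of 0.9. That is why accuracy, specificity, and false positive rate are poor optimization targets for this goal.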