AWS Certified Machine Learning – Specialty — Question 324

A company wants to detect credit card fraud. The company has observed that an average of 2% of credit card transactions are fraudulent. A data scientist trains a classifier on a year's worth of credit card transaction data. The classifier needs to identify the fraudulent transactions. The company wants to accurately capture as many fraudulent transactions as possible.

Which metrics should the data scientist use to optimize the classifier? (Choose two.)

Answer options

Correct answer: D, E

Explanation

With a highly imbalanced dataset (2% fraud), accuracy is misleading: a trivial model that always predicts 'no fraud' would achieve about 98% accuracy while catching no fraud at all. To capture as many fraudulent transactions as possible, the data scientist should maximize the true positive rate (recall), which is the fraction of actual fraud cases that the model flags. The F1 score, the harmonic mean of precision and recall, complements it by penalizing a model that reaches high recall only by generating excessive false positives.
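The arithmetic behind this reasoning can be checked with a small sketch. The example below is illustrative only: the 1,000-transaction dataset, the "candidate" model's predictions, and the helper name `metrics` are all assumptions made up for the demonstration, not part of the exam question. It compares an always-'no fraud' baseline against a hypothetical classifier at the 2% fraud rate.

```python
def metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = fraud)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Assumed toy data: 1,000 transactions, 20 of them fraudulent (2%).
y_true = [1] * 20 + [0] * 980

# Baseline: always predict "no fraud".
always_negative = [0] * 1000

# Hypothetical classifier: catches 16 of 20 fraud cases,
# and wrongly flags 24 legitimate transactions.
candidate_pred = [1] * 16 + [0] * 4 + [1] * 24 + [0] * 956

baseline = metrics(y_true, always_negative)
candidate = metrics(y_true, candidate_pred)

print("baseline: ", baseline)
print("candidate:", candidate)
```

Under these assumed numbers, the baseline scores higher on accuracy than the candidate even though it detects no fraud, while recall and F1 both rank the candidate far ahead. This is why accuracy is a poor optimization target here and recall and F1 are preferred.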