AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 56
Case study
An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data.
The ML engineer needs to use an Amazon SageMaker built-in algorithm to train the model.
Which algorithm should the ML engineer use to meet this requirement?
Answer options
- A. LightGBM
- B. Linear learner
- C. K-means clustering
- D. Neural Topic Model (NTM)
Correct answer: A
Explanation
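The correct answer is LightGBM. It is a gradient boosting framework that builds an ensemble of decision trees, so it can model nonlinear relationships and interactions between interdependent features, and it provides hyperparameters for weighting the minority class in an imbalanced dataset. Linear learner fits a linear model, so it does not capture feature interdependencies unless interaction terms are engineered by hand. K-means clustering is an unsupervised algorithm that groups unlabeled data and does not train a classifier on labeled fraud examples. Neural Topic Model (NTM) is an unsupervised algorithm for discovering topics in text documents, not for classifying transactions.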
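
As an illustration of how class imbalance is commonly addressed with LightGBM, the sketch below derives a `scale_pos_weight` value from the label counts (ratio of negative to positive examples). The function name `imbalance_hyperparameters` and the sample labels are hypothetical, and the parameter names follow the open-source LightGBM documentation; which of them the SageMaker built-in algorithm accepts should be confirmed against its hyperparameter reference before use.

```python
from collections import Counter


def imbalance_hyperparameters(labels, positive_label=1):
    """Build a hyperparameter dict that up-weights the rare positive class.

    Hypothetical helper: scale_pos_weight is set to (#negatives / #positives),
    a common heuristic for binary classification on imbalanced data.
    """
    counts = Counter(labels)
    n_pos = counts[positive_label]
    n_neg = len(labels) - n_pos
    if n_pos == 0:
        raise ValueError("no positive examples in labels")
    return {
        "scale_pos_weight": n_neg / n_pos,  # weight applied to the positive class
        "num_leaves": 31,                   # placeholder value, tune as needed
        "learning_rate": 0.1,               # placeholder value, tune as needed
    }


# Made-up example: 990 legitimate transactions, 10 fraudulent ones.
labels = [0] * 990 + [1] * 10
params = imbalance_hyperparameters(labels)
print(params["scale_pos_weight"])  # 99.0
```

The resulting dictionary could then be passed as hyperparameters to a training job. The ratio heuristic is only a starting point, and the weight is usually tuned against a metric that suits imbalanced data, such as precision-recall AUC.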