AWS Certified Machine Learning – Specialty — Question 345

An ecommerce company sends a weekly email newsletter to all of its customers. Management has hired a team of writers to create additional targeted content. A data scientist needs to identify five customer segments based on age, income, and location. The customers' current segmentation is unknown. The data scientist previously built an XGBoost model to predict the likelihood of a customer responding to an email based on age, income, and location.
Why does the XGBoost model NOT meet the current requirements, and how can this be fixed?

Answer options

Correct answer: D

Explanation

Finding customer segments when the groupings are unknown is an unsupervised clustering task. XGBoost and k-Nearest-Neighbors (kNN) are supervised learning algorithms that require labeled target data, which does not exist in this scenario. Training a k-means clustering model with K = 5 is the correct unsupervised approach to group the unlabeled data into five distinct segments.