EMC Proven Professional – Data Science and Big Data Analytics — Question 37

You have used k-means clustering to classify the behavior of 100,000 customers for a retail store. You decide to use household income, age, gender, and yearly purchase amount as measures. You have chosen to use 8 clusters and notice that 2 clusters have only 3 customers assigned. What should you do?

Answer options

Correct answer: A

Explanation

The correct answer is A. Two clusters that hold only 3 of 100,000 customers each suggest that the chosen number of clusters is too high: those clusters are too small to represent meaningful customer segments, so the number of clusters should be reduced. Increasing the number of clusters (B) would likely exacerbate the issue, while decreasing the number of measures (C) or identifying additional measures (D) does not directly address the problem of sparsely populated clusters.
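
The check behind this reasoning can be sketched in a few lines of Python. This is only an illustration, not part of the exam material: the labels are made-up data shaped like the scenario in the question, and the helper names (`cluster_sizes`, `tiny_clusters`) and the 0.1% `min_share` threshold are hypothetical choices, not standard values.

```python
from collections import Counter


def cluster_sizes(labels):
    """Return {cluster_id: number of points} for a list of cluster labels."""
    return dict(Counter(labels))


def tiny_clusters(labels, min_share=0.001):
    """Return the ids of clusters holding less than min_share of all points.

    min_share is a hypothetical threshold (0.1% here); pick one that fits
    the business question rather than treating this value as a rule.
    """
    total = len(labels)
    counts = Counter(labels)
    return sorted(c for c, n in counts.items() if n / total < min_share)


# Made-up labels mimicking the question: 100,000 customers in 8 clusters,
# two of which (ids 0 and 1) hold only 3 customers each.
labels = [0] * 3 + [1] * 3
large_ids = [2, 3, 4, 5, 6, 7]
for i in range(100_000 - len(labels)):
    labels.append(large_ids[i % len(large_ids)])

print(cluster_sizes(labels))
print(tiny_clusters(labels))  # clusters 0 and 1 are flagged
```

If a check like this flags clusters after a k-means run, one reasonable next step is to refit with a smaller number of clusters and look at the sizes again.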