EMC Proven Professional – Data Science and Big Data Analytics — Question 37
You have used k-means clustering to classify the behavior of 100,000 customers for a retail store. You decide to use household income, age, gender, and yearly purchase amount as measures. You have chosen to use 8 clusters and notice that 2 clusters have only 3 customers assigned. What should you do?
Answer options
- A. Decrease the number of clusters
- B. Increase the number of clusters
- C. Decrease the number of measures used
- D. Identify additional measures to add to the analysis
Correct answer: A
Explanation
The correct answer is A. With 100,000 customers spread over 8 clusters, two clusters holding only 3 customers suggest that the chosen number of clusters is too high: the extra centroids are most likely capturing a handful of outliers rather than meaningful customer segments. Increasing the number of clusters (B) would likely exacerbate the issue, while decreasing the number of measures (C) or identifying additional measures (D) does not directly address the problem of sparsely populated clusters.
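The check behind this reasoning can be sketched in a few lines of Python. This is only an illustration: the function name `flag_sparse_clusters`, the 0.1% size threshold, and the synthetic cluster labels are all assumptions made for the example, and the question is read here as meaning 3 customers in each of the two small clusters. The sketch does not run k-means itself; it only inspects the cluster assignments that a k-means run would produce.

```python
from collections import Counter


def flag_sparse_clusters(labels, min_fraction=0.001):
    """Return {cluster_id: size} for clusters holding fewer than
    min_fraction of all points (the threshold is an assumed rule of thumb)."""
    sizes = Counter(labels)
    threshold = min_fraction * len(labels)
    return {cluster: n for cluster, n in sizes.items() if n < threshold}


# Synthetic labels mimicking the scenario: 100,000 customers, 8 clusters,
# two of which hold only 3 customers each (illustrative data, not real output).
labels = [6] * 3 + [7] * 3
big_sizes = [20000, 18000, 17000, 16000, 15000, 13994]
for cluster_id, size in enumerate(big_sizes):
    labels += [cluster_id] * size

sparse = flag_sparse_clusters(labels)
print(sparse)  # {6: 3, 7: 3}
```

If a check like this flags clusters, one reasonable next step is to rerun the clustering with a smaller number of clusters and confirm that the sparse clusters disappear.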