CompTIA DataX (DY0-001) — Question 33
A data scientist is clustering a data set but does not want to specify the number of clusters present. Which of the following algorithms should the data scientist use?
Answer options
- A. DBSCAN
- B. k-nearest neighbors
- C. k-means
- D. Logistic regression
Correct answer: A
Explanation
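DBSCAN is a density-based clustering algorithm that does not require the number of clusters to be specified in advance. Instead, it groups points that lie in dense regions, governed by a neighborhood radius and a minimum point count, and labels isolated points as noise, so the number of clusters emerges from the data. In contrast, k-means requires the number of clusters (k) to be chosen beforehand. k-nearest neighbors is a supervised method for classification and regression, where k is the number of neighbors consulted, not a cluster count. Logistic regression is a supervised classification model and does not perform clustering.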
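To make the point concrete, here is a minimal pure-Python sketch of the DBSCAN idea. It is an illustration under stated assumptions, not a reference implementation: the function names `region_query` and `dbscan`, the parameter names `eps` and `min_samples`, and the toy data are all chosen for this example. Note that no cluster count is passed in anywhere.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label used for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    eps and min_samples are illustrative parameter names: the neighborhood
    radius and the minimum neighborhood size for a point to count as core.
    """
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        # points[i] is a core point: start a new cluster and expand it
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels


# Toy data (made up for this example): two tight groups and one outlier.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (50, 50),
]

labels = dbscan(points, eps=1.5, min_samples=3)
n_clusters = len({label for label in labels if label != NOISE})

print(labels)      # [0, 0, 0, 0, 1, 1, 1, 1, -1]
print(n_clusters)  # 2
```

The algorithm finds two clusters and flags the outlier as noise without being told how many clusters to expect. The trade-off is that the radius and minimum point count still have to be chosen, and they influence how many clusters are found. In practice a library implementation such as scikit-learn's `DBSCAN` would normally be used instead of hand-written code.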