CompTIA DataX (DY0-001) — Question 33

A data scientist is clustering a data set but does not want to specify the number of clusters present. Which of the following algorithms should the data scientist use?

Answer options

Correct answer: A (DBSCAN)

Explanation

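DBSCAN is a density-based clustering algorithm that does not require the number of clusters to be specified in advance. Instead, it takes a neighborhood radius and a minimum number of points per dense region, and the number of clusters follows from the density structure of the data. Points that belong to no dense region are labeled as noise. In contrast, k-means requires the number of clusters (k) to be chosen beforehand. k-nearest neighbors is a supervised method used for classification and regression, not clustering, and logistic regression is a supervised classification model, so neither applies to an unlabeled clustering task.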
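
To illustrate why DBSCAN needs no cluster count, here is a minimal pure-Python sketch of the idea. It is not a production implementation. The function names (`region_query`, `dbscan`), the parameter names (`eps`, `min_pts`), and the sample points are all assumptions chosen for this example. In practice a library version such as scikit-learn's `sklearn.cluster.DBSCAN` would normally be used.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label for points that belong to no dense region


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster_id += 1  # i is a core point, so it starts a new cluster
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core, keep expanding
    return labels


# Illustrative data: two tight groups and one isolated point.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (50, 50),
]

# Note that no cluster count is passed in, only density parameters.
labels = dbscan(points, eps=1.5, min_pts=3)
n_clusters = len(set(labels) - {NOISE})

print(labels)      # [0, 0, 0, 0, 1, 1, 1, 1, -1]
print(n_clusters)  # 2
```

The two clusters are discovered from the data, and the isolated point is marked as noise. The trade-off is that the result depends on the choice of `eps` and `min_pts`, which replace the cluster count as the values the analyst has to tune.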