Which of the following measures would a data scientist most likely use to calculate the s…

Question

Which of the following measures would a data scientist most likely use to calculate the similarity of two text strings?

Accepted Answer

Correct answer: B. Edit distance. Edit distance quantifies the difference between two strings by counting the minimum number of operations (such as character insertions, deletions, and substitutions) required to transform one string into the other; a smaller distance means the strings are more similar. Option A (Word cloud) visualizes text data rather than measuring similarity, option C (String indexing) relates to organizing text for retrieval, and option D (k-nearest neighbors) is a classification method that does not itself measure string similarity.
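
To make the idea of "minimum number of operations" concrete, here is a minimal sketch of one common variant, the Levenshtein distance, which counts single-character insertions, deletions, and substitutions. The question does not say which variant is meant, so that choice is an assumption. The function names `edit_distance` and `similarity` are illustrative, and normalizing by the longer string's length is just one possible way to turn the distance into a 0-to-1 score.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (assumed variant:
    insertions, deletions, and substitutions each cost 1)."""
    # previous[j] = distance between the prefix of a processed so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical helper: scale the distance to a 0..1 similarity score."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))  # 3
    print(edit_distance("flaw", "lawn"))       # 2
```

In this sketch, "kitten" becomes "sitting" in three steps: substitute k with s, substitute e with i, and append g.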

CompTIA DataX (DY0-001) — Question 80

Answer options

A. Word cloud
B. Edit distance
C. String indexing
D. k-nearest neighbors
