Databricks Certified Generative AI Engineer Associate — Question 61
A Generative AI Engineer is deciding between using LSH (Locality Sensitive Hashing) and HNSW (Hierarchical Navigable Small World) for indexing their vector database. Their top priority is semantic accuracy.
Which approach should the Generative AI Engineer use to evaluate these two techniques?
Answer options
- A. Compare the cosine similarities of the embeddings of returned results against those of a representative sample of test inputs
- B. Compare the Bilingual Evaluation Understudy (BLEU) scores of returned results for a representative sample of test inputs
- C. Compare the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) scores of returned results for a representative sample of test inputs
- D. Compare the Levenshtein distances of returned results against a representative sample of test inputs
Correct answer: A
Explanation
The correct answer is A. Cosine similarity between the embeddings of the returned results and the embeddings of the test inputs measures how close they are in vector space, which is what semantic accuracy means for a vector index. BLEU (B) and ROUGE (C) score n-gram overlap between generated text and reference text, so they suit translation and summarization tasks rather than retrieval quality. Levenshtein distance (D) counts the character-level edits needed to turn one string into another and says nothing about meaning.
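The comparison described in option A can be sketched in a few lines of Python. This is a minimal illustration, not a benchmark of real LSH or HNSW indexes: the function names, the two-dimensional toy embeddings, and the result lists `results_index_a` and `results_index_b` are all hypothetical stand-ins for what each index would return for the same test queries.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def mean_result_similarity(query_embeddings, results_per_query):
    """Average cosine similarity between each query and its returned results."""
    scores = [
        cosine_similarity(query, result)
        for query, results in zip(query_embeddings, results_per_query)
        for result in results
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical test queries (already embedded).
queries = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical top-2 results returned by two differently indexed stores.
results_index_a = [[[1.0, 0.0], [1.0, 1.0]], [[0.0, 1.0], [1.0, 1.0]]]
results_index_b = [[[1.0, 0.0], [2.0, 0.0]], [[0.0, 3.0], [0.0, 1.0]]]

score_a = mean_result_similarity(queries, results_index_a)
score_b = mean_result_similarity(queries, results_index_b)

print(f"index A mean cosine similarity: {score_a:.4f}")
print(f"index B mean cosine similarity: {score_b:.4f}")
```

With this toy data the second index scores higher because every result it returns points in the same direction as its query. In a real evaluation the same averaging would be run over a representative sample of test inputs for each indexing technique.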