AWS Certified Machine Learning – Specialty — Question 27
An interactive online dictionary wants to add a widget that displays words used in similar contexts. A Machine Learning Specialist is asked to provide word features for the downstream nearest neighbor model powering the widget.
What should the Specialist do to meet these requirements?
Answer options
- A. Create one-hot word encoding vectors.
- B. Produce a set of synonyms for every word using Amazon Mechanical Turk.
- C. Create word embedding vectors that store edit distance with every other word.
- D. Download word embeddings pre-trained on a large corpus.
Correct answer: D
Explanation
The correct choice, D, is ideal because pre-trained word embeddings are designed to capture semantic relationships between words, making them suitable for contextual similarity tasks. Option A is less effective as one-hot encoding does not capture relationships between words. Option B focuses on synonyms rather than contextual usage, and option C is not as effective for nearest neighbor models since edit distance does not reflect semantic similarity.