AWS Certified Machine Learning – Specialty — Question 166
A data scientist is reviewing customer comments about a company's products. The data scientist needs to present an initial exploratory analysis by using charts and a word cloud. The data scientist must use feature engineering techniques to prepare this analysis before starting a natural language processing (NLP) model.
Which combination of feature engineering techniques should the data scientist use to meet these requirements? (Choose two.)
Answer options
- A. Named entity recognition
- B. Coreference
- C. Stemming
- D. Term frequency-inverse document frequency (TF-IDF)
- E. Sentiment analysis
Correct answer: C, D
Explanation
Stemming (C) reduces inflected words to a common root, so variants such as "charging" and "charges" are counted as a single term in the charts and the word cloud. Term frequency-inverse document frequency (TF-IDF) (D) weights each term by how often it appears in a comment relative to how many comments contain it, so distinctive words rank above words that appear everywhere. Named entity recognition (A), coreference resolution (B), and sentiment analysis (E) are NLP tasks in their own right rather than text-preparation steps, so they do not supply the word-frequency features that this exploratory analysis needs.
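To make the two chosen techniques concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the sample comments are invented, `naive_stem` is a crude suffix stripper standing in for a real stemmer (such as a Porter stemmer), and `tf_idf` uses the basic unsmoothed weighting tf × log(N / df), which may differ from the variant a given library applies.

```python
import math
import re
from collections import Counter

# Hypothetical sample comments, invented for this illustration.
COMMENTS = [
    "The battery charges quickly and the charging cable is sturdy",
    "Charging stopped working after two weeks",
    "Great screen, but the battery drains quickly",
]


def naive_stem(word):
    """Strip a few common suffixes. A stand-in for a real stemmer, not Porter."""
    for suffix in ("ing", "es", "ed", "s"):
        # Keep at least three characters so short words like "is" stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase the text, split it into alphabetic tokens, and stem each one."""
    return [naive_stem(w) for w in re.findall(r"[a-z]+", text.lower())]


def tf_idf(docs):
    """Return one {term: score} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


if __name__ == "__main__":
    for comment, doc_scores in zip(COMMENTS, tf_idf(COMMENTS)):
        top = sorted(doc_scores.items(), key=lambda kv: kv[1], reverse=True)[:3]
        print(comment, "->", top)
```

The per-term scores could then drive a bar chart or set the word sizes in a word cloud. Because stemming runs first, "charges" and "charging" contribute to one shared term instead of splitting their weight across two.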