A data scientist plans to classify the sentiment polarity of 10,000 product reviews coll…

Question

A data scientist plans to classify the sentiment polarity of 10,000 product reviews collected from the Internet. What is the most appropriate model to use? Assume that labeled training data is available.

Accepted Answer

Correct answer: A. Naïve Bayesian classifier. Sentiment polarity is a categorical target and labeled training data is available, so this is a supervised classification problem, which a Naïve Bayesian classifier handles well. Linear regression predicts a continuous value, so it is not appropriate for classification. Logistic regression is a classifier and could be considered, but Naïve Bayes is often preferred for text classification because it is simple to train and copes well with sparse, high-dimensional word-count features. K-means clustering is an unsupervised method, so it does not fit a scenario where labeled data is available.
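To make the reasoning concrete, here is a minimal sketch of a multinomial Naïve Bayes sentiment classifier in plain Python. It is an illustration only, not part of the exam material: the function names (`tokenize`, `train_naive_bayes`, `predict`), the whitespace tokenizer, the add-one smoothing, and the four toy reviews are all assumptions made for the example. A real project with 10,000 reviews would more likely use a library implementation and a proper train/test split.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple assumption)."""
    return text.lower().split()


def train_naive_bayes(reviews, labels):
    """Fit a multinomial Naive Bayes model from labeled reviews.

    Returns (log_priors, word_counts, vocab).
    """
    class_doc_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for text, label in zip(reviews, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    total_docs = len(labels)
    log_priors = {c: math.log(n / total_docs) for c, n in class_doc_counts.items()}
    return log_priors, word_counts, vocab


def predict(model, text):
    """Return the label with the highest posterior log-probability."""
    log_priors, word_counts, vocab = model
    scores = {}
    for label, log_prior in log_priors.items():
        total_words = sum(word_counts[label].values())
        score = log_prior
        for token in tokenize(text):
            if token not in vocab:
                continue  # skip words never seen during training
            # Laplace (add-one) smoothing avoids log(0) for unseen pairs.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data standing in for the labeled product reviews.
train_reviews = [
    "great product works well",
    "love it excellent quality",
    "terrible broke after a day",
    "awful waste of money",
]
train_labels = ["pos", "pos", "neg", "neg"]

model = train_naive_bayes(train_reviews, train_labels)
print(predict(model, "excellent product love it"))
print(predict(model, "awful quality broke"))
```

The labels drive the training step: each class gets its own word counts and prior, which is why labeled data is a prerequisite and why an unsupervised method such as K-means does not fit the question.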

EMC Proven Professional – Data Science and Big Data Analytics — Question 40
