AWS Certified Machine Learning – Specialty — Question 209
A data scientist at a food production company wants to use an Amazon SageMaker built-in model to classify different vegetables. The current dataset has many features. The company wants to save on memory costs when the data scientist trains and deploys the model. The company also wants to be able to find similar data points for each test data point.
Which algorithm will meet these requirements?
Answer options
- A. K-nearest neighbors (k-NN) with dimension reduction
- B. Linear learner with early stopping
- C. K-means
- D. Principal component analysis (PCA) with the algorithm mode set to random
Correct answer: A
Explanation
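K-nearest neighbors (k-NN) with dimension reduction meets all three requirements. It is a classifier, it works by looking up the training points closest to each test point (so the similar data points are a direct by-product of prediction), and its dimension reduction step shrinks the many-feature dataset before the index is built, which lowers the memory footprint for both training and deployment. The other options each miss at least one requirement. Linear learner can classify, but it does not return similar data points, and early stopping shortens training rather than reducing memory use. K-means is an unsupervised clustering algorithm: it groups points around centroids but does not predict labels or return the individual neighbors of a test point. PCA only reduces dimensionality; it neither classifies nor retrieves similar points, regardless of the algorithm mode.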
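
To make the reasoning concrete, here is a minimal sketch of the idea in plain Python, not SageMaker's implementation. It projects 50-feature points down to 8 dimensions with a random sign matrix, then classifies a test point by majority vote over its nearest neighbors and also returns those neighbors. The data is synthetic, and the function names, dimensions, and class labels are all assumptions made for illustration. In SageMaker's k-NN algorithm, the corresponding behavior is configured through the `dimension_reduction_type` (`sign` or `fjlt`) and `dimension_reduction_target` hyperparameters.

```python
import math
import random
from collections import Counter


def make_sign_projection(in_dim, out_dim, seed=0):
    """Build a random +/- sign matrix that maps in_dim features to out_dim."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.choice((-scale, scale)) for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(matrix, vector):
    """Apply the projection matrix to one feature vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def knn_classify(train_points, train_labels, query, k=3):
    """Return (majority label, indices of the k nearest training points)."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    neighbors = order[:k]
    votes = Counter(train_labels[i] for i in neighbors)
    return votes.most_common(1)[0][0], neighbors


# Synthetic data (hypothetical): two vegetable classes in 50 dimensions.
IN_DIM, OUT_DIM = 50, 8
rng = random.Random(42)
centers = {"carrot": 2.0, "spinach": -2.0}

train_points, train_labels = [], []
for name, center in centers.items():
    for _ in range(20):
        train_points.append([center + rng.gauss(0, 0.5) for _ in range(IN_DIM)])
        train_labels.append(name)

# Reduce every training point once; only the 8-dimensional copies are kept.
projection = make_sign_projection(IN_DIM, OUT_DIM)
reduced_train = [project(projection, p) for p in train_points]

# Classify a new point drawn near the "carrot" center.
query = [centers["carrot"] + rng.gauss(0, 0.5) for _ in range(IN_DIM)]
reduced_query = project(projection, query)
label, neighbor_ids = knn_classify(reduced_train, train_labels, reduced_query, k=3)

print("predicted label:", label)
print("nearest training rows:", neighbor_ids)
```

The stored index here holds 8 values per training point instead of 50, which is the memory saving the question asks about, while the returned neighbor indices cover the requirement to find similar data points for each test point.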