AWS Certified Machine Learning – Specialty — Question 201
A manufacturing company needs to identify returned smartphones that have been damaged by moisture. The company has an automated process that produces 2,000 diagnostic values for each phone. The database contains more than five million phone evaluations. The evaluation process is consistent, and there are no missing values in the data. A machine learning (ML) specialist has trained an Amazon SageMaker linear learner ML model to classify phones as moisture damaged or not moisture damaged by using all available features. The model's F1 score is 0.6.
Which changes in model training would MOST likely improve the model's F1 score? (Choose two.)
Answer options
- A. Continue to use the SageMaker linear learner algorithm. Reduce the number of features with the SageMaker principal component analysis (PCA) algorithm.
- B. Continue to use the SageMaker linear learner algorithm. Reduce the number of features with the scikit-learn multi-dimensional scaling (MDS) algorithm.
- C. Continue to use the SageMaker linear learner algorithm. Set the predictor type to regressor.
- D. Use the SageMaker k-means algorithm with k of less than 1,000 to train the model.
- E. Use the SageMaker k-nearest neighbors (k-NN) algorithm. Set a dimension reduction target of less than 1,000 to train the model.
Correct answer: A, E
Explanation
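Option A is correct because 2,000 diagnostic values per phone very likely include many correlated or uninformative features. Projecting them onto a smaller set of principal components with SageMaker PCA can remove noise and redundancy before the linear learner is trained. Option E is correct because k-NN is a non-parametric classifier that can capture non-linear decision boundaries that a linear model misses. The SageMaker k-NN algorithm also has a built-in dimension reduction step (the dimension_reduction_target hyperparameter), which reduces the problems that distance-based methods have in a 2,000-dimensional space.
Option B is wrong because MDS works from the pairwise distances between all samples, which is impractical for more than five million evaluations, and the scikit-learn MDS implementation has no transform method for embedding new phones at inference time. Option C is wrong because the regressor predictor type is meant for continuous targets, whereas this is a binary classification problem. Option D is wrong because k-means is an unsupervised clustering algorithm that does not use the moisture-damage labels, so it is not a suitable replacement for a classifier.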
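The two correct options can be sketched as hyperparameter settings for the SageMaker built-in algorithms. This is a minimal illustration, not a tuned configuration: the hyperparameter names are those of the built-in PCA, linear learner, and k-NN algorithms, but every value chosen here (200 components, k of 10, the sample size, the mini-batch size) is an assumption, and the helper functions `check_knn_reduction` and `mds_distance_matrix_terabytes` are hypothetical names introduced only for this example. The last helper is a back-of-the-envelope estimate of why a full pairwise distance matrix, as MDS needs, does not fit this dataset.

```python
# Illustrative hyperparameter sketches for options A and E.
# All numeric values are assumptions chosen for the example.

NUM_FEATURES = 2_000       # diagnostic values per phone (from the question)
NUM_SAMPLES = 5_000_000    # "more than five million" evaluations

# Option A, step 1: SageMaker PCA reduces the 2,000 features.
pca_hyperparameters = {
    "feature_dim": NUM_FEATURES,
    "num_components": 200,            # assumed target dimensionality
    "algorithm_mode": "randomized",   # suited to many features/observations
    "subtract_mean": True,
    "mini_batch_size": 500,           # assumed
}

# Option A, step 2: linear learner trained on the PCA output.
linear_learner_hyperparameters = {
    "feature_dim": pca_hyperparameters["num_components"],
    "predictor_type": "binary_classifier",  # not "regressor" (option C)
}

# Option E: SageMaker k-NN with its built-in dimension reduction.
knn_hyperparameters = {
    "feature_dim": NUM_FEATURES,
    "k": 10,                              # assumed neighbor count
    "predictor_type": "classifier",
    "sample_size": 500_000,               # assumed number of points to index
    "dimension_reduction_type": "sign",   # random projection; "fjlt" also exists
    "dimension_reduction_target": 500,    # below 1,000, as the option states
}


def check_knn_reduction(params: dict) -> bool:
    """Return True if the reduction target is positive and below feature_dim."""
    target = params["dimension_reduction_target"]
    return 0 < target < params["feature_dim"]


def mds_distance_matrix_terabytes(n_samples: int, bytes_per_value: int = 8) -> int:
    """Size of a dense n x n float64 distance matrix, in decimal terabytes."""
    return n_samples * n_samples * bytes_per_value // 10**12


if __name__ == "__main__":
    print("k-NN reduction target valid:", check_knn_reduction(knn_hyperparameters))
    print("MDS distance matrix (TB):", mds_distance_matrix_terabytes(NUM_SAMPLES))
```

In a real training job these dictionaries would be passed to the corresponding SageMaker estimators. The reduced dimensionality for either option should be chosen by comparing F1 scores on a held-out validation set, not fixed in advance.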