AWS Certified Machine Learning – Specialty — Question 153

A real-estate company is launching a new product that predicts the prices of new houses. The historical data for the properties and prices is stored in .csv format in an Amazon S3 bucket. The data has a header, some categorical fields, and some missing values. The company's data scientists have used Python with a common open-source library to fill the missing values with zeros. The data scientists have dropped all of the categorical fields and have trained a model by using the open-source linear regression algorithm with the default parameters.
The accuracy of the predictions with the current model is below 50%. The company wants to improve the model performance and launch the new product as soon as possible.
Which solution will meet these requirements with the LEAST operational overhead?

Answer options

Correct answer: D

Explanation

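The correct answer is D. Amazon SageMaker Autopilot, the AutoML capability of SageMaker, automates data preprocessing (including handling missing values and encoding categorical fields), algorithm selection, and hyperparameter tuning. It therefore addresses the likely causes of the poor performance, namely the zero-filled missing values, the dropped categorical fields, and the untuned default linear regression, without custom code. Options A, B, and C require more manual feature engineering and model selection, so they carry more operational overhead than the automated approach.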
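
As a rough illustration of how little code the Autopilot approach needs, the sketch below builds a request for the boto3 SageMaker `create_auto_ml_job` call. It is a minimal sketch under stated assumptions, not the exam's reference solution: the bucket paths, the target column name `price`, the job name, and the IAM role ARN are all hypothetical placeholders, and the choice of MSE as the objective metric is an assumption.

```python
def build_autopilot_request(job_name, input_s3_uri, output_s3_uri,
                            target_column, role_arn):
    """Build keyword arguments for the SageMaker create_auto_ml_job call.

    All argument values are supplied by the caller; nothing here comes
    from the original question.
    """
    return {
        "AutoMLJobName": job_name,
        "InputDataConfig": [
            {
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": input_s3_uri,  # raw .csv data with header
                    }
                },
                # Column that Autopilot should learn to predict.
                "TargetAttributeName": target_column,
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Price prediction is a regression problem; when ProblemType is
        # set, an objective metric must be set as well.
        "ProblemType": "Regression",
        "AutoMLJobObjective": {"MetricName": "MSE"},  # assumed metric
        "RoleArn": role_arn,
    }


def submit_autopilot_job(request):
    """Submit the job. Requires boto3 and valid AWS credentials."""
    import boto3  # imported here so building a request needs no AWS setup

    client = boto3.client("sagemaker")
    return client.create_auto_ml_job(**request)


# Hypothetical values, for illustration only.
request = build_autopilot_request(
    job_name="house-prices-autopilot",
    input_s3_uri="s3://example-bucket/housing/raw/",
    output_s3_uri="s3://example-bucket/housing/autopilot-output/",
    target_column="price",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
)
# submit_autopilot_job(request)  # uncomment to launch the job in AWS
```

The raw data, including the categorical fields and missing values, is passed as is; the preprocessing and model search are left to Autopilot rather than hand-written.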