AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 182

An ML engineer is building a model to predict house and apartment prices. The model uses three features: Square Meters, Price, and Age of Building. The dataset has 10,000 data rows. The data includes data points for one large mansion and one extremely small apartment.

The ML engineer must perform preprocessing on the dataset to ensure that the model produces accurate predictions for the typical house or apartment.

Which solution will meet these requirements?

Answer options

Correct answer: A

Explanation

The correct answer is A because removing outliers helps the model focus on typical data, thus improving prediction accuracy. Performing a log transformation on the Square Meters variable addresses issues with skewness in the data. Options B, C, and D either retain outliers, which can skew predictions, or apply inappropriate transformations that do not enhance the model's performance.