AWS Certified Machine Learning – Specialty — Question 83

A company wants to predict the sale prices of houses based on available historical sales data. The target variable in the company's dataset is the sale price. The features include parameters such as the lot size, living area measurements, non-living area measurements, number of bedrooms, number of bathrooms, year built, and postal code. The company wants to use multi-variable linear regression to predict house sale prices.
Which step should a machine learning specialist take to remove features that are irrelevant for the analysis and reduce the model's complexity?

Answer options

Correct answer: D

Explanation

The correct answer is D because running a correlation check against the target variable helps identify which features have a significant relationship with the sale price, allowing for the removal of those that do not contribute meaningfully to the prediction. Options A and B focus on variance without considering the target variable, which may not effectively reduce complexity. Option C looks at mutual correlation among features but does not assess their relevance to the target variable.