A Data Scientist is building a model to predict customer churn using a dataset of 100 con…

Question

A Data Scientist is building a model to predict customer churn using a dataset of 100 continuous numerical features. The Marketing team has not provided any insight about which features are relevant for churn prediction. The Marketing team wants to interpret the model and see the direct impact of relevant features on the model outcome. While training a logistic regression model, the Data Scientist observes that there is a wide gap between the training and validation set accuracy.
Which methods can the Data Scientist use to improve the model performance and satisfy the Marketing team's needs? (Choose two.)

Accepted Answer

Correct answer: A, C.

A. Add L1 regularization to the classifier
C. Perform recursive feature elimination

The wide gap between training and validation accuracy indicates that the model is overfitting. L1 regularization reduces overfitting by penalizing the absolute size of the coefficients. It also drives the coefficients of irrelevant features to exactly zero, so the remaining nonzero coefficients show the direct impact of each relevant feature. Recursive feature elimination repeatedly refits the model and removes the least important features, which identifies the most relevant features and improves both interpretability and performance. The other options either do not address the overfitting effectively or do not provide the necessary insight into feature relevance.
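The two selected techniques can be illustrated with a minimal sketch in plain Python. Everything here is an assumption made for illustration: the synthetic data (12 standardized features instead of 100, with only features 0 and 1 driving the label), the helper names fit_l1_logistic and recursive_feature_elimination, and the hyperparameters. The L1 fit uses proximal gradient descent, which is one of several ways to solve the problem. In practice a library implementation would normally be used, for example scikit-learn's LogisticRegression with penalty="l1" and RFE from sklearn.feature_selection.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink toward zero, clipping at zero.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_l1_logistic(X, y, lam, lr=0.5, epochs=100):
    """Proximal gradient descent for logistic loss + lam * sum(|w_j|)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - target
            grad_b += err
            for j in range(d):
                grad_w[j] += err * row[j]
        b -= lr * grad_b / n  # the intercept is not penalized
        w = [soft_threshold(wj - lr * gj / n, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def recursive_feature_elimination(X, y, keep, step=5):
    """Refit, drop the `step` features with the smallest |coefficient|, repeat."""
    active = list(range(len(X[0])))
    while len(active) > keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = fit_l1_logistic(X_sub, y, lam=0.0)  # unpenalized refit
        n_drop = min(step, len(active) - keep)
        weakest = set(sorted(range(len(active)), key=lambda k: abs(w[k]))[:n_drop])
        active = [j for k, j in enumerate(active) if k not in weakest]
    return active


# Synthetic stand-in for the churn data (assumption): 12 standardized features,
# of which only features 0 and 1 influence the label.
rng = random.Random(0)
n, d = 200, 12
X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
y = [1.0 if 2.0 * row[0] - 2.0 * row[1] + rng.gauss(0.0, 0.5) > 0 else 0.0 for row in X]

w_l1, _ = fit_l1_logistic(X, y, lam=0.1)
print("nonzero L1 coefficients:", {j: round(wj, 3) for j, wj in enumerate(w_l1) if wj != 0.0})

selected = recursive_feature_elimination(X, y, keep=2)
print("features kept by RFE:", selected)
```

Both approaches rank features by coefficient magnitude, so the sketch assumes the features are on a common scale. With unscaled features, standardizing them first keeps the comparison meaningful.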

AWS Certified Machine Learning – Specialty — Question 156
