AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 202
An ML engineer is developing a linear regression ML model. The model shows high accuracy on the training dataset but performs poorly on unseen new data.
Which action should the ML engineer take to address this issue?
Answer options
- A. Increase the complexity of the model to capture more patterns in the training data. Use Amazon SageMaker Debugger to monitor for convergence issues.
- B. Apply ML techniques such as cross-validation and regularization. Use Amazon SageMaker Experiments to track and compare different model versions and their performance metrics.
- C. Directly deploy the model into production. Use Amazon SageMaker Clarify to interpret model outputs on new data. Adjust the model based on these insights.
- D. Increase the size of the training dataset without adjusting the size of the model. Retrain the model on the new data. Generate a confusion matrix to analyze the results.
Correct answer: B
Explanation
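The correct answer is B. Strong performance on the training dataset combined with poor performance on unseen data is the signature of overfitting. Cross-validation evaluates the model on held-out folds, which gives a more reliable estimate of how well it generalizes. Regularization (for example, an L1 or L2 penalty on the coefficients) constrains the model so that it fits less of the noise in the training data. SageMaker Experiments then lets the engineer track and compare the resulting runs and their metrics.
Increasing model complexity (A) typically makes overfitting worse. Deploying directly to production (C) ships a model that is already known to generalize poorly; SageMaker Clarify explains predictions and detects bias, but it does not correct overfitting. Adding training data (D) can reduce overfitting, but the option includes no check on generalization, and a confusion matrix is a classification tool that does not apply to a linear regression model.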
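The two techniques named in option B can be sketched in a few lines. The example below is a minimal illustration, not part of the exam material: it assumes NumPy is available, uses synthetic data, and implements L2 (ridge) regularization in closed form with a simple k-fold cross-validation loop. The helper names (`ridge_fit`, `mse`, `cv_mse`), the data sizes, and the candidate penalty values are all hypothetical choices made for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression without an intercept:
    w = (X^T X + lam * I)^-1 X^T y. lam = 0 is ordinary least squares."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def mse(X, y, w):
    """Mean squared error of the linear model w on (X, y)."""
    return float(np.mean((X @ w - y) ** 2))

def cv_mse(X, y, lam, k=5):
    """Average validation MSE over k contiguous folds.
    The synthetic rows are already in random order, so no shuffle is done."""
    indices = np.arange(len(y))
    scores = []
    for val_idx in np.array_split(indices, k):
        train_idx = np.setdiff1d(indices, val_idx)
        w = ridge_fit(X[train_idx], y[train_idx], lam)
        scores.append(mse(X[val_idx], y[val_idx], w))
    return float(np.mean(scores))

# Synthetic data (assumption): few samples, many features, and only
# three features that actually influence the target.
rng = np.random.default_rng(0)
n_samples, n_features = 40, 25
X = rng.normal(size=(n_samples, n_features))
true_w = np.zeros(n_features)
true_w[:3] = [2.0, -1.0, 0.5]
y = X @ true_w + rng.normal(scale=1.0, size=n_samples)

# Compare candidate penalties by cross-validated error, not training error.
lambdas = [0.0, 0.1, 1.0, 10.0, 100.0]
results = {lam: cv_mse(X, y, lam) for lam in lambdas}
best_lam = min(results, key=results.get)

for lam in lambdas:
    train_err = mse(X, y, ridge_fit(X, y, lam))
    print(f"lambda={lam:>6}: train MSE={train_err:.3f}, CV MSE={results[lam]:.3f}")
print("penalty with lowest CV MSE:", best_lam)
```

Training error is always lowest at lambda = 0, so it cannot be used to choose the penalty; the cross-validated error is the quantity to compare. In a SageMaker workflow, each (lambda, CV MSE) pair is the kind of parameter and metric that SageMaker Experiments is meant to record for comparison across runs.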