CompTIA DataX (DY0-001) — Question 63

A data analyst is examining the correlation matrix of a new data set to identify issues that could adversely impact model performance. Which of the following is the analyst most likely checking for?

Answer options

Correct answer: B

Explanation

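The correct answer is B. Multicollinearity occurs when two or more independent variables in a regression model are highly correlated with one another. A correlation matrix exposes this directly, because strongly related pairs of predictors show coefficients close to +1 or -1. Multicollinearity inflates the variance of the coefficient estimates, making them unstable and hard to interpret. Undersampling and oversampling are resampling techniques used to address class imbalance, not something a correlation matrix reveals. Overfitting describes a model that is too complex and captures noise instead of the underlying trend; it is diagnosed by comparing training and validation performance, not by inspecting correlations between features.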
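
As an illustration of the check described above, here is a minimal sketch in plain Python that computes pairwise Pearson correlations and flags strongly correlated feature pairs. The function names (pearson, highly_correlated_pairs), the 0.9 threshold, and the synthetic columns x1, x2, x3 are all assumptions made for this example; they are not part of the exam question.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def highly_correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every feature pair with |r| >= threshold.

    `columns` maps a feature name to its list of values. The 0.9 cutoff is an
    illustrative assumption, not a universal rule.
    """
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:  # upper triangle only: each pair once
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Synthetic data (hypothetical): x2 is almost a linear copy of x1, x3 is independent.
rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(500)]
x2 = [2.0 * v + rng.gauss(0, 0.1) for v in x1]
x3 = [rng.gauss(0, 1) for _ in range(500)]

flagged = highly_correlated_pairs({"x1": x1, "x2": x2, "x3": x3})
for a, b, r in flagged:
    print(f"{a} ~ {b}: r = {r:.3f}")
```

Only the x1/x2 pair should be flagged here, since x2 was built from x1 plus a small amount of noise. A pairwise check like this can miss multicollinearity that involves three or more variables together, which is why the variance inflation factor (VIF) is often used as a complementary diagnostic.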