CompTIA DataX (DY0-001) — Question 10
A data scientist needs to:
Build a predictive model that gives the likelihood that a car will get a flat tire.
Provide a data set of cars that had flat tires and cars that did not.
All the cars in the data set had sensors taking weekly measurements of tire pressure similar to the sensors that will be installed in the cars consumers drive. Which of the following is the most immediate data concern?
Answer options
- A. Granularity misalignment
- B. Multivariate outliers
- C. Insufficient domain expertise
- D. Lagged observations
Correct answer: D
Explanation
The correct answer is D because lagged observations can significantly impact the predictive model by introducing delays in the data that may not accurately represent current tire conditions. Granularity misalignment, multivariate outliers, and insufficient domain expertise are also concerns, but they do not have as immediate an effect on the model's accuracy as lagged observations do.