Google Cloud Professional Data Engineer — Question 66
You are building a model to predict whether or not it will rain on a given day. You have thousands of input features and want to see if you can improve training speed by removing some features while having a minimum effect on model accuracy. What can you do?
Answer options
- A. Eliminate features that are highly correlated to the output labels.
- B. Combine highly co-dependent features into one representative feature.
- C. Instead of feeding in each feature individually, average their values in batches of 3.
- D. Remove the features that have null values for more than 50% of the training records.
Correct answer: B
Explanation
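The correct answer is B. Highly co-dependent (mutually correlated) features carry largely redundant information, so merging each such group into one representative feature reduces dimensionality, and therefore training time, with little loss of predictive signal. Option A is incorrect because features that are highly correlated with the output labels are typically among the most predictive ones; removing them would hurt accuracy the most. Option C is incorrect because averaging arbitrary batches of three features mixes unrelated signals and discards information regardless of whether the features are redundant. Option D is incorrect because a high proportion of null values does not mean a feature is uninformative; the non-null values, or the missingness itself, may still be predictive, so removing those features could noticeably reduce accuracy.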
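As a rough illustration of option B, the sketch below groups features whose pairwise Pearson correlation exceeds a threshold and replaces each group with the average of its standardized members. Everything in it is an assumption made for illustration: the feature names (`humidity`, `dew_point`, `wind_speed`), the synthetic data, the 0.9 threshold, and the greedy grouping strategy. It also assumes no column is constant. In practice a technique such as PCA is a common alternative for the same goal.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def standardize(xs):
    """Rescale a column to zero mean and unit variance."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / std for x in xs]


def combine_correlated_features(columns, threshold=0.9):
    """Greedily group columns whose |r| with a group's first member meets
    the threshold, then replace each group with one averaged column."""
    groups = []  # each group is a list of (column name, sign) pairs
    for name, values in columns.items():
        for group in groups:
            anchor_name = group[0][0]
            r = pearson(columns[anchor_name], values)
            if abs(r) >= threshold:
                # Flip negatively correlated members so they do not cancel out.
                group.append((name, 1.0 if r > 0 else -1.0))
                break
        else:
            # No existing group matched, so this column starts a new group.
            groups.append([(name, 1.0)])

    combined = {}
    for group in groups:
        scaled = [
            [sign * v for v in standardize(columns[name])]
            for name, sign in group
        ]
        merged_name = "+".join(name for name, _ in group)
        # Average the standardized members row by row.
        combined[merged_name] = [sum(row) / len(row) for row in zip(*scaled)]
    return combined


# Synthetic data (hypothetical): dew_point is nearly a linear function of
# humidity, while wind_speed is generated independently.
random.seed(0)
humidity = [random.uniform(30, 100) for _ in range(200)]
dew_point = [0.3 * h - 5 + random.gauss(0, 0.5) for h in humidity]
wind_speed = [random.uniform(0, 40) for _ in range(200)]

features = {
    "humidity": humidity,
    "dew_point": dew_point,
    "wind_speed": wind_speed,
}
reduced = combine_correlated_features(features)
print(list(reduced))
```

With this synthetic data, the two co-dependent columns should collapse into a single feature while the independent column is kept on its own, so the model trains on two inputs instead of three. Every output column is standardized, including ungrouped ones, so the values are no longer in their original units.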