Google Cloud Professional Data Engineer — Question 1
You are building a model to make clothing recommendations. You know a user's fashion preference is likely to change over time, so you build a data pipeline to stream new data back to the model as it becomes available. How should you use this data to train the model?
Answer options
- A. Continuously retrain the model on just the new data.
- B. Continuously retrain the model on a combination of existing data and the new data.
- C. Train on the existing data while using the new data as your test set.
- D. Train on the new data while using the existing data as your test set.
Correct answer: B
Explanation
The correct answer is B because continuously retraining the model on both existing and new data ensures that it stays relevant and learns from the latest trends and preferences. Option A is incorrect as it ignores valuable historical data. Options C and D are also wrong since they do not use the new data to enhance the model's learning, but rather segregate it for testing purposes.