Google Cloud Professional Data Engineer — Question 256

You are preparing data to serve a sales demand prediction model. The training data undergoes several pre-processing steps, including scaling numerical features and one-hot encoding categorical features. The model is deployed on Vertex AI Endpoints. You need to prevent training-serving skew and ensure accurate predictions in production. You want a solution that is easy to implement.

What should you do?

Answer options

Correct answer: B

Explanation

The correct answer is B. Applying the same pre-processing logic at serving time that was used during training ensures the model receives inputs in the same form it was trained on, which is what prevents training-serving skew and keeps predictions accurate. Option A introduces complexity without guaranteeing consistency, while options C and D do not ensure identical pre-processing, so predictions can still be skewed.
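
To make the idea of identical pre-processing concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not necessarily the mechanism option B prescribes: the function names (fit_preprocessor, transform), the feature names (price, region), and the sample rows are all hypothetical. The point it shows is that the scaling statistics and the one-hot vocabulary are learned once from training data, saved, and then reused unchanged when serving.

```python
import json
import statistics


def fit_preprocessor(rows, numeric_key, categorical_key):
    """Learn scaling statistics and the category vocabulary from training rows only."""
    values = [row[numeric_key] for row in rows]
    return {
        "numeric_key": numeric_key,
        "categorical_key": categorical_key,
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values) or 1.0,  # fall back to 1.0 if all values are equal
        "vocab": sorted({row[categorical_key] for row in rows}),
    }


def transform(row, params):
    """Apply the fitted parameters; the same function is used for training and serving."""
    scaled = (row[params["numeric_key"]] - params["mean"]) / params["std"]
    # A category not seen during training encodes as all zeros.
    one_hot = [
        1.0 if row[params["categorical_key"]] == category else 0.0
        for category in params["vocab"]
    ]
    return [scaled] + one_hot


# Hypothetical training rows (assumed feature names).
train_rows = [
    {"price": 10.0, "region": "east"},
    {"price": 20.0, "region": "west"},
    {"price": 30.0, "region": "east"},
]

params = fit_preprocessor(train_rows, "price", "region")
train_features = [transform(row, params) for row in train_rows]

# Persist the fitted parameters next to the model artifact, then reload them
# in the serving code instead of recomputing or re-implementing them.
serialized = json.dumps(params)
serving_params = json.loads(serialized)
serving_features = transform({"price": 20.0, "region": "west"}, serving_params)
```

Because the serving path reloads the parameters fitted at training time and calls the same transform function, a request with the same raw values as a training row produces the same feature vector. Skew typically appears when the serving side recomputes statistics on live data or re-implements the encoding separately.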