Google Cloud Professional Machine Learning Engineer — Question 128

You are building a linear regression model on BigQuery ML to predict a customer’s likelihood of purchasing your company’s products. Your model uses a city name variable as a key predictive component. In order to train and serve the model, your data must be organized in columns. You want to prepare your data using the least amount of coding while maintaining the predictable variables. What should you do?

Answer options

Correct answer: D

Explanation

The correct answer, D, is appropriate because one-hot encoding will effectively transform categorical city data into a binary format suitable for the regression model. This method maintains all predictive variables while requiring minimal coding. Options A and C involve additional complexity with TensorFlow and regional labeling, which are unnecessary for this task, and option B removes critical predictive information by omitting the city column altogether.