Google Cloud Professional Machine Learning Engineer — Question 252

You are developing a custom TensorFlow classification model based on tabular data. Your raw data is stored in BigQuery. contains hundreds of millions of rows, and includes both categorical and numerical features. You need to use a MaxMin scaler on some numerical features, and apply a one-hot encoding to some categorical features such as SKU names. Your model will be trained over multiple epochs. You want to minimize the effort and cost of your solution. What should you do?

Answer options

Correct answer: C

Explanation

The correct answer is C because using TFX components with Dataflow optimally handles both encoding and scaling, ensuring efficient processing at scale. Exporting to TFRecords and then feeding into Vertex AI Training streamlines the workflow. The other options either involve unnecessary steps or do not leverage the full capabilities of the tools available, leading to inefficiencies.