Google Cloud Professional Machine Learning Engineer — Question 159
You want to train an AutoML model to predict house prices by using a small public dataset stored in BigQuery. You need to prepare the data and want to use the simplest, most efficient approach. What should you do?
Answer options
- A. Write a query that preprocesses the data by using BigQuery and creates a new table. Create a Vertex AI managed dataset with the new table as the data source.
- B. Use Dataflow to preprocess the data. Write the output in TFRecord format to a Cloud Storage bucket.
- C. Write a query that preprocesses the data by using BigQuery. Export the query results as CSV files, and use those files to create a Vertex AI managed dataset.
- D. Use a Vertex AI Workbench notebook instance to preprocess the data by using the pandas library. Export the data as CSV files, and use those files to create a Vertex AI managed dataset.
Correct answer: A
Explanation
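Option A is correct because the data already lives in BigQuery: a single SQL query can preprocess it and write the result to a new table, and Vertex AI can create a managed tabular dataset directly from that table, with no export step and no additional service. Option B adds a Dataflow pipeline that a small dataset does not need, and TFRecord is not an import format for Vertex AI tabular datasets, which accept BigQuery tables or CSV files. Option C would work, but exporting the query results to CSV is an unnecessary extra step when Vertex AI can read the table directly. Option D would also work, but it requires provisioning a notebook instance, pulling the data out of BigQuery into pandas, and exporting CSV files, all of which the BigQuery-only approach avoids.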
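The workflow in option A can be sketched in Python as follows. This is a minimal illustration and not an official recipe: the table IDs, the column names (`sqft`, `bedrooms`, `price`), and the helper function names are all hypothetical, and the cloud calls assume the `google-cloud-bigquery` and `google-cloud-aiplatform` packages are installed and credentials are configured.

```python
def build_preprocess_sql(source_table: str, target_table: str) -> str:
    """Build a query that cleans the source rows and writes them to a new table.

    The columns (sqft, bedrooms, price) and the filter are hypothetical;
    replace them with whatever preprocessing the real dataset needs.
    """
    return "\n".join([
        f"CREATE OR REPLACE TABLE `{target_table}` AS",
        "SELECT",
        "  sqft,",
        "  bedrooms,",
        "  price",
        f"FROM `{source_table}`",
        "WHERE price IS NOT NULL AND sqft > 0",
    ])


def bq_source_uri(table: str) -> str:
    """Format a fully qualified table ID as a bq:// source URI."""
    return f"bq://{table}"


def create_managed_dataset(project: str, location: str, source_table: str,
                           target_table: str, display_name: str):
    """Run the preprocessing query, then create a tabular dataset from its output."""
    # Imported lazily so the pure helpers above work without the cloud SDKs.
    from google.cloud import aiplatform, bigquery

    # Step 1: preprocess inside BigQuery and materialize the new table.
    sql = build_preprocess_sql(source_table, target_table)
    bigquery.Client(project=project).query(sql).result()

    # Step 2: point a Vertex AI managed dataset at the new table (no export).
    aiplatform.init(project=project, location=location)
    return aiplatform.TabularDataset.create(
        display_name=display_name,
        bq_source=bq_source_uri(target_table),
    )
```

Calling `create_managed_dataset("my-project", "us-central1", "my-project.housing.raw", "my-project.housing.clean", "house-prices")` with real IDs in place of these placeholders would leave a dataset ready for AutoML training. The data never leaves BigQuery, which is what makes this simpler than the CSV-based options.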