Google Cloud Professional Machine Learning Engineer — Question 250
You are developing a recommendation engine for an online clothing store. The historical customer transaction data is stored in BigQuery and Cloud Storage. You need to perform exploratory data analysis (EDA), preprocessing, and model training. You plan to rerun these EDA, preprocessing, and training steps as you experiment with different types of algorithms. You want to minimize the cost and development effort of running these steps as you experiment. How should you configure the environment?
Answer options
- A. Create a Vertex AI Workbench user-managed notebook using the default VM instance, and use the %%bigquery magic commands in Jupyter to query the tables.
- B. Create a Vertex AI Workbench managed notebook to browse and query the tables directly from the JupyterLab interface.
- C. Create a Vertex AI Workbench user-managed notebook on a Dataproc Hub, and use the %%bigquery magic commands in Jupyter to query the tables.
- D. Create a Vertex AI Workbench managed notebook on a Dataproc cluster, and use the spark-bigquery-connector to access the tables.
Correct answer: B
Explanation
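The correct answer is B. A Vertex AI Workbench managed notebook lets you browse and query the BigQuery tables directly from the JupyterLab interface, and Google handles setup and maintenance of the environment, so rerunning EDA, preprocessing, and training during experimentation takes little development effort. Options A and C use user-managed notebooks, which you must configure and maintain yourself; option C also adds a Dataproc Hub that the scenario does not call for. Option D adds the cost and complexity of a Dataproc cluster and the spark-bigquery-connector, even though nothing in the scenario requires Spark.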
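As an illustration of the kind of query step that gets rerun during EDA, here is a minimal Python sketch that could be run from a notebook cell. It is not part of the exam item. The table name `my-project.store.transactions`, the `item_id` column, and both helper functions are hypothetical; `run_query` assumes the `google-cloud-bigquery` client library (with pandas support) is installed and default credentials are available.

```python
from typing import Any


def build_top_items_query(table: str, limit: int = 10) -> str:
    """Return SQL that counts purchases per item in a transactions table.

    The table name and the `item_id` column are hypothetical placeholders.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        "SELECT item_id, COUNT(*) AS purchases\n"
        f"FROM `{table}`\n"
        "GROUP BY item_id\n"
        "ORDER BY purchases DESC\n"
        f"LIMIT {limit}"
    )


def run_query(sql: str) -> Any:
    """Run the SQL in BigQuery and return the result as a pandas DataFrame.

    Assumes google-cloud-bigquery is installed and default credentials exist.
    """
    # Imported lazily so the query builder above works without the library.
    from google.cloud import bigquery

    client = bigquery.Client()
    return client.query(sql).to_dataframe()


if __name__ == "__main__":
    # Print the SQL only; call run_query(sql) inside an authenticated notebook.
    print(build_top_items_query("my-project.store.transactions", limit=5))
```

Keeping the SQL in a small function makes it easy to rerun the same step with a different table or limit while trying other algorithms.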