Google Cloud Professional Machine Learning Engineer — Question 144
You are developing an image recognition model using PyTorch based on the ResNet50 architecture. Your code works on your local laptop with a small subsample. Your full dataset has 200k labeled images. You want to quickly scale your training workload while minimizing cost. You plan to use 4 V100 GPUs. What should you do?
Answer options
- A. Create a Google Kubernetes Engine cluster with a node pool that has 4 V100 GPUs. Prepare and submit a TFJob operator to this node pool.
- B. Create a Vertex AI Workbench user-managed notebooks instance with 4 V100 GPUs, and use it to train your model.
- C. Package your code with Setuptools, and use a pre-built container. Train your model with Vertex AI using a custom tier that contains the required GPUs.
- D. Configure a Compute Engine VM with all the dependencies that launches the training. Train your model with Vertex AI using a custom tier that contains the required GPUs.
Correct answer: C
Explanation
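The correct answer is C. Vertex AI custom training is a managed service: it provisions the requested machines and GPUs when the job starts and releases them when the job ends, so you pay only for the time the training actually runs. Because the code already works locally, packaging it as a Python source distribution with Setuptools and running it in a pre-built PyTorch container is the fastest route, with no custom Docker image to build and no infrastructure to maintain. The "custom tier" lets you specify the machine type and accelerators yourself, here 4 V100 GPUs. Option A is wrong on two counts: TFJob is the Kubeflow operator for TensorFlow jobs, not PyTorch, and you would have to create and operate the GKE cluster yourself. Option B works technically, but a user-managed notebooks instance is meant for interactive development and is billed for as long as the instance is running, whether or not training is in progress. Option D adds a manually configured Compute Engine VM that Vertex AI training does not need, since a training job takes a Python package or a container image, so that setup work is wasted effort.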
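As a rough illustration of option C, the sketch below builds the worker pool specification that a Vertex AI custom job would take for a packaged Python trainer with 4 V100 GPUs. The bucket path, module name `trainer.task`, machine type `n1-standard-16`, and container image URI are assumptions made for the example and are not given in the question; check the image URI and the machine and GPU pairing against the current Vertex AI documentation before using them.

```python
# Minimal sketch of a Vertex AI worker pool spec for option C.
# All names below (bucket, package, module, image URI, machine type) are
# hypothetical placeholders, not values taken from the question.

def build_worker_pool_specs(package_uri, python_module, executor_image_uri,
                            gpu_count=4):
    """Return a single-replica worker pool spec that runs a Python package."""
    return [
        {
            "machine_spec": {
                "machine_type": "n1-standard-16",        # assumed host VM size
                "accelerator_type": "NVIDIA_TESLA_V100",
                "accelerator_count": gpu_count,
            },
            "replica_count": 1,  # one machine holding all four GPUs
            "python_package_spec": {
                # Pre-built PyTorch GPU training container (assumed URI).
                "executor_image_uri": executor_image_uri,
                # Source distribution built with Setuptools, uploaded to GCS.
                "package_uris": [package_uri],
                # Module inside the package that starts training.
                "python_module": python_module,
            },
        }
    ]


specs = build_worker_pool_specs(
    package_uri="gs://my-bucket/trainer-0.1.tar.gz",
    python_module="trainer.task",
    executor_image_uri=(
        "us-docker.pkg.dev/vertex-ai/training/pytorch-gpu.1-13.py310:latest"
    ),
)
```

The source distribution would typically be built with `python setup.py sdist` and copied to Cloud Storage. With the `google-cloud-aiplatform` SDK installed, a list like `specs` can be passed as the `worker_pool_specs` argument of `aiplatform.CustomJob`, and the job started with its `run()` method. The GPUs are attached only for the lifetime of that job.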