Google Cloud Professional Machine Learning Engineer — Question 208
You are training a custom language model for your company using a large dataset. You plan to use the Reduction Server strategy on Vertex AI. You need to configure the worker pools of the distributed training job. What should you do?
Answer options
- A. Configure the machines of the first two worker pools to have GPUs, and to use a container image where your training code runs. Configure the third worker pool to have GPUs, and use the reductionserver container image.
- B. Configure the machines of the first two worker pools to have GPUs and to use a container image where your training code runs. Configure the third worker pool to use the reductionserver container image without accelerators, and choose a machine type that prioritizes bandwidth.
- C. Configure the machines of the first two worker pools to have TPUs and to use a container image where your training code runs. Configure the third worker pool to use the reductionserver container image without accelerators, and choose a machine type that prioritizes bandwidth.
- D. Configure the machines of the first two pools to have TPUs, and to use a container image where your training code runs. Configure the third pool to have TPUs, and use the reductionserver container image.
Correct answer: B
Explanation
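The correct answer is B. With the Reduction Server strategy, the first two worker pools (the primary replica and the additional workers) run your training code on GPUs, because Reduction Server speeds up the NCCL-based all-reduce step between GPU workers. The third worker pool hosts the reducers, which run the reductionserver container image provided by Vertex AI rather than your training code. Reducers only aggregate gradients sent over the network, so they need no accelerators and should use a CPU machine type chosen for network bandwidth. Option A is wrong because it attaches GPUs to the reducer pool, which adds cost without benefit. Options C and D are wrong because Reduction Server is designed for GPU workers, not TPUs; option D also places accelerators on the reducer pool.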
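The layout described in option B can be sketched as a list of worker pool specs in Python. This is a minimal illustration, not a definitive configuration: the helper name `build_worker_pool_specs`, the GPU machine type, the accelerator type and count, and the replica counts are all assumptions chosen for the example. The reducer image URI and the `n1-highcpu-16` machine type are recalled from the Vertex AI Reduction Server documentation and should be verified there before use.

```python
# Assumed value: reducer image URI as recalled from the Vertex AI docs; verify before use.
REDUCTION_SERVER_IMAGE = (
    "us-docker.pkg.dev/vertex-ai-restricted/training/reductionserver:latest"
)


def build_worker_pool_specs(training_image, worker_count, reducer_count):
    """Return three worker pool specs: primary, workers, reducers (hypothetical helper)."""
    # Illustrative GPU configuration for the pools that run the training code.
    gpu_machine = {
        "machine_type": "n1-standard-16",
        "accelerator_type": "NVIDIA_TESLA_V100",
        "accelerator_count": 2,
    }
    # Reducers get a CPU-only machine; no accelerator fields are set.
    reducer_machine = {"machine_type": "n1-highcpu-16"}

    return [
        # Pool 0: the primary replica (exactly one), GPU, training container.
        {
            "machine_spec": dict(gpu_machine),
            "replica_count": 1,
            "container_spec": {"image_uri": training_image},
        },
        # Pool 1: additional workers, same GPU machine and training container.
        {
            "machine_spec": dict(gpu_machine),
            "replica_count": worker_count,
            "container_spec": {"image_uri": training_image},
        },
        # Pool 2: reducers, CPU-only, running the reductionserver image.
        {
            "machine_spec": reducer_machine,
            "replica_count": reducer_count,
            "container_spec": {"image_uri": REDUCTION_SERVER_IMAGE},
        },
    ]
```

A list built this way could then be passed as the `worker_pool_specs` argument of `aiplatform.CustomJob` in the `google-cloud-aiplatform` SDK. The training image URI would be your own container, and the replica counts depend on the size of the job.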