Google Cloud Professional Machine Learning Engineer — Question 149

You are training an object detection machine learning model on a dataset that consists of three million X-ray images, each roughly 2 GB in size. You are using Vertex AI Training to run a custom training application on a Compute Engine instance with 32-cores, 128 GB of RAM, and 1 NVIDIA P100 GPU. You notice that model training is taking a very long time. You want to decrease training time without sacrificing model performance. What should you do?

Answer options

Correct answer: D

Explanation

The correct choice is D. The tf.distribute.Strategy API distributes training across multiple devices (several GPUs on one machine, or several machines), which can shorten training time substantially without changing the model itself. Options A and B are unlikely to help: adding memory does not make each training step faster, and switching to a less capable GPU could slow training further. Option C can end training sooner, but it does not speed up the training steps themselves, and stopping early may cost model performance.
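As a rough illustration of the approach named in D, the sketch below wraps model creation in tf.distribute.MirroredStrategy, one of the strategies available under the tf.distribute.Strategy API. It is a minimal sketch under stated assumptions, not the exam's reference solution. The helper global_batch_size, the function train_distributed, the make_dataset callable, and the tiny placeholder model are all hypothetical; a real object detector and input pipeline would be far larger.

```python
from typing import Callable


def global_batch_size(per_replica_batch: int, num_replicas: int) -> int:
    """Return the batch size seen across all replicas in one training step."""
    if per_replica_batch <= 0 or num_replicas <= 0:
        raise ValueError("batch size and replica count must be positive")
    return per_replica_batch * num_replicas


def train_distributed(make_dataset: Callable, per_replica_batch: int = 8, epochs: int = 1):
    """Train a placeholder model under tf.distribute.MirroredStrategy.

    make_dataset is a hypothetical callable: it takes the global batch size
    and returns a batched tf.data.Dataset of (image, target) pairs.
    """
    # Imported here so the helper above works without TensorFlow installed.
    import tensorflow as tf

    # MirroredStrategy creates one replica per GPU visible on this machine.
    strategy = tf.distribute.MirroredStrategy()
    batch = global_batch_size(per_replica_batch, strategy.num_replicas_in_sync)

    # Variables must be created inside the scope so they are mirrored.
    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.layers.Input(shape=(64, 64, 1)),
            tf.keras.layers.Conv2D(8, 3, activation="relu"),
            tf.keras.layers.GlobalAveragePooling2D(),
            tf.keras.layers.Dense(4),  # placeholder output, not a real detector head
        ])
        model.compile(optimizer="adam", loss="mse")

    model.fit(make_dataset(batch), epochs=epochs)
    return model
```

On the single-P100 machine described in the question, strategy.num_replicas_in_sync would be 1, so this code alone would not speed anything up. The gain comes from pairing the strategy with additional GPUs or workers, at which point the global batch size scales with the replica count.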