Google Cloud Professional Machine Learning Engineer — Question 67

You lead a data science team at a large international corporation. Most of the models your team trains are large-scale models using high-level TensorFlow APIs on AI Platform with GPUs. Your team usually takes a few weeks or months to iterate on a new version of a model. You were recently asked to review your team’s spending. How should you reduce your Google Cloud compute costs without impacting the model’s performance?

Answer options

Correct answer: C

Explanation

The correct answer is C because using Kuberflow on Google Kubernetes Engine with preemptible VMs allows for significant cost savings while still using checkpoints to save progress during training. Option A and B do not provide the cost benefits of preemptible VMs, and D lacks the checkpointing feature, which is essential for long training processes to avoid losing progress.