Google Cloud Professional Machine Learning Engineer — Question 304
You are training a large-scale deep learning model on a Cloud TPU. While monitoring training progress in TensorBoard, you observe that TPU utilization is consistently low and that there are delays between the completion of one training step and the start of the next. You want to improve TPU utilization and overall training performance. How should you address this issue?
Answer options
- A. Apply tf.data.Dataset.map with vectorized operations and parallelization.
- B. Use tf.data.Dataset.interleave with multiple data sources.
- C. Use tf.data.Dataset.cache on the dataset after the first epoch.
- D. Implement tf.data.Dataset.prefetch in the data pipeline.
Correct answer: D
Explanation
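The correct answer is D. Delays between the end of one training step and the start of the next indicate that the TPU is idle while it waits for the input pipeline to deliver the next batch. tf.data.Dataset.prefetch decouples data production from consumption: while the TPU executes the current step, the host prepares the batch for the following step, so the data is ready as soon as the step finishes. Options A, B, and C can make data processing or reading faster, but none of them overlaps input preparation with training, so they do not directly remove the idle time between steps.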
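
As a minimal sketch of option D, the pipeline below ends with a prefetch call. The toy source (Dataset.range), the doubling map function, the batch size, and the helper name build_pipeline are assumptions made for illustration and are not part of the question. The calls themselves (map, batch, prefetch, and tf.data.AUTOTUNE) are standard tf.data API.

```python
import tensorflow as tf


def build_pipeline(batch_size: int = 4) -> tf.data.Dataset:
    """Build a toy input pipeline that ends with prefetch (hypothetical helper)."""
    # Assumption: a small integer range stands in for the real training data.
    ds = tf.data.Dataset.range(8)
    # Placeholder preprocessing step; element order is preserved by default.
    ds = ds.map(lambda x: x * 2, num_parallel_calls=tf.data.AUTOTUNE)
    # Fixed-size batches; dropping the remainder keeps batch shapes static.
    ds = ds.batch(batch_size, drop_remainder=True)
    # Prepare upcoming batches in the background while the current one is consumed.
    ds = ds.prefetch(tf.data.AUTOTUNE)
    return ds


# Iterating the dataset yields the same elements as without prefetch;
# only the timing of when each batch is produced changes.
batches = [batch.numpy().tolist() for batch in build_pipeline()]
```

Placing prefetch as the last transformation is the usual choice, since it lets the whole upstream pipeline run ahead of the training step. tf.data.AUTOTUNE leaves the buffer size to the tf.data runtime instead of hard-coding it.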