Google Cloud Professional Machine Learning Engineer — Question 242
You are developing an ML model to identify your company’s products in images. You have access to over one million images in a Cloud Storage bucket. You plan to experiment with different TensorFlow models by using Vertex AI Training. You need to read images at scale during training while minimizing data I/O bottlenecks. What should you do?
Answer options
- A. Load the images directly into the Vertex AI compute nodes by using Cloud Storage FUSE. Read the images by using the tf.data.Dataset.from_tensor_slices function.
- B. Create a Vertex AI managed dataset from your image data. Access the AIP_TRAINING_DATA_URI environment variable to read the images by using the tf.data.Dataset.list_files function.
- C. Convert the images to TFRecords and store them in a Cloud Storage bucket. Read the TFRecords by using the tf.data.TFRecordDataset function.
- D. Store the URLs of the images in a CSV file. Read the file by using the tf.data.experimental.CsvDataset function.
Correct answer: C
Explanation
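The correct answer is C. TFRecord packs many serialized examples into a small number of large files, so training reads the data sequentially instead of issuing a separate Cloud Storage request for each of the million-plus images. tf.data.TFRecordDataset can also read several sharded files in parallel and prefetch records, which keeps the training job supplied with data. Options A, B, and D all still fetch every image as an individual object: A reads small files one at a time through Cloud Storage FUSE, B lists and opens the individual image files, and D only stores the image URLs in a CSV file, so each image must still be downloaded separately. That per-file overhead is the I/O bottleneck the question asks you to avoid.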
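The approach in option C can be sketched as follows. This is a minimal illustration rather than a reference pipeline: the feature names `image` and `label`, the shard naming scheme, the 64x64 resize, and the helper names (`serialize_example`, `write_shards`, `parse_example`, `make_dataset`) are all assumptions made for the example. It writes to a local temporary directory with synthetic images so that it runs anywhere; in a Vertex AI job the same calls would typically point at `gs://` paths instead.

```python
import os
import tempfile

import tensorflow as tf


def serialize_example(image_bytes: bytes, label: int) -> bytes:
    """Pack one encoded image and its integer label into a tf.train.Example."""
    feature = {
        # Feature names "image" and "label" are assumptions for this sketch.
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    return example.SerializeToString()


def write_shards(samples, out_dir, num_shards):
    """Write (image_bytes, label) pairs round-robin into num_shards TFRecord files."""
    paths = [
        os.path.join(out_dir, f"train-{i:05d}-of-{num_shards:05d}.tfrecord")
        for i in range(num_shards)
    ]
    writers = [tf.io.TFRecordWriter(p) for p in paths]
    for i, (image_bytes, label) in enumerate(samples):
        writers[i % num_shards].write(serialize_example(image_bytes, label))
    for writer in writers:
        writer.close()
    return paths


def parse_example(record):
    """Decode one serialized Example back into an (image, label) pair."""
    spec = {
        "image": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(record, spec)
    image = tf.io.decode_jpeg(parsed["image"], channels=3)
    image = tf.image.resize(image, [64, 64]) / 255.0  # illustrative size
    return image, parsed["label"]


def make_dataset(file_pattern, batch_size):
    """Read sharded TFRecords in parallel, then parse, batch, and prefetch."""
    files = tf.data.Dataset.list_files(file_pattern, shuffle=True)
    ds = files.interleave(
        lambda path: tf.data.TFRecordDataset(path),
        cycle_length=4,
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    ds = ds.map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)


# Demo with synthetic JPEGs; real code would read the source images instead.
out_dir = tempfile.mkdtemp()
samples = [
    (tf.io.encode_jpeg(tf.zeros([8, 8, 3], tf.uint8)).numpy(), i % 3)
    for i in range(10)
]
shard_paths = write_shards(samples, out_dir, num_shards=2)
dataset = make_dataset(os.path.join(out_dir, "train-*.tfrecord"), batch_size=4)
```

Note that `list_files` is used here only to enumerate a handful of shard files, not one file per image, which is what distinguishes this from option B. Sharding matters because `interleave` can only read in parallel when there is more than one file to read from.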