You are profiling the performance of your TensorFlow model training time and notice a per…

Question

You are profiling the performance of your TensorFlow model training time and notice a performance issue caused by inefficiencies in the input data pipeline for a single 5 terabyte CSV file dataset on Cloud Storage. You need to optimize the input pipeline performance. Which action should you try first to increase the efficiency of your pipeline?

Accepted Answer

Correct answer: C. Split into multiple CSV files and use a parallel interleave transformation. A single 5 TB CSV file is consumed as one sequential stream, so reading it becomes the bottleneck of the input pipeline. Splitting it into multiple files lets the pipeline open and read several files concurrently through a parallel interleave, which is why Option C is the best first action. The other options may improve performance, but not as effectively as splitting the data for parallel reads.
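The two steps in Option C can be sketched as follows. This is a minimal illustration, not a definitive implementation: `shard_csv`, `make_dataset`, the shard naming scheme, the cycle length of 16, and the batch size are all assumptions made for the example. The sharding helper streams a local file, which only illustrates the idea; a real 5 TB object on Cloud Storage would more plausibly be split by a distributed job. In TensorFlow 2.x the parallel interleave is expressed with `Dataset.interleave` and `num_parallel_calls`.

```python
import csv
import os


def shard_csv(src_path, out_dir, rows_per_shard):
    """Stream one large CSV into smaller files, repeating the header in each shard."""
    os.makedirs(out_dir, exist_ok=True)
    shard_paths = []
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)
        out, writer = None, None
        for i, row in enumerate(reader):
            # Start a new shard every rows_per_shard data rows.
            if i % rows_per_shard == 0:
                if out is not None:
                    out.close()
                path = os.path.join(out_dir, f"shard-{len(shard_paths):05d}.csv")
                out = open(path, "w", newline="")
                writer = csv.writer(out)
                writer.writerow(header)
                shard_paths.append(path)
            writer.writerow(row)
        if out is not None:
            out.close()
    return shard_paths


def make_dataset(file_pattern, batch_size=1024):
    """Read shards concurrently with a parallel interleave (needs TensorFlow 2.x)."""
    import tensorflow as tf  # imported lazily so shard_csv works without TensorFlow

    files = tf.data.Dataset.list_files(file_pattern, shuffle=True)
    lines = files.interleave(
        # One line-reader per shard; skip(1) drops each shard's header row.
        lambda path: tf.data.TextLineDataset(path).skip(1),
        cycle_length=16,                      # assumed: shards read at the same time
        num_parallel_calls=tf.data.AUTOTUNE,  # let tf.data choose the parallelism
        deterministic=False,                  # allow out-of-order reads for throughput
    )
    return lines.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

A call such as `make_dataset("gs://my-bucket/shards/shard-*.csv")` would then yield batches of raw CSV lines (the bucket path is hypothetical); parsing each line, for example with `tf.io.decode_csv`, would follow as a separate `map` step.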

Google Cloud Professional Machine Learning Engineer — Question 77

Explanation