Google Cloud Professional Machine Learning Engineer — Question 274

You are using Kubeflow Pipelines to develop an end-to-end PyTorch-based MLOps pipeline. The pipeline reads data from BigQuery, processes the data, conducts feature engineering, model training, model evaluation, and deploys the model as a binary file to Cloud Storage. You are writing code for several different versions of the feature engineering and model training steps, and running each new version in Vertex AI Pipelines. Each pipeline run is taking over an hour to complete. You want to speed up the pipeline execution to reduce your development time, and you want to avoid additional costs. What should you do?

Answer options

Correct answer: B

Explanation

Enabling caching in all steps of the Kubeflow pipeline allows the system to reuse the outputs of previous runs for unchanged steps, significantly reducing execution time for subsequent runs. Commenting out parts of the pipeline does not enhance performance and might hinder testing. Offloading feature engineering to BigQuery can simplify the pipeline but may not directly address execution speed. Adding a GPU can improve training speed but may incur additional costs, which you want to avoid.