You are tasked with building an MLOps pipeline to retrain tree-based models in production…

Question

You are tasked with building an MLOps pipeline to retrain tree-based models in production. The pipeline will include components related to data ingestion, data processing, model training, model evaluation, and model deployment. Your organization primarily uses PySpark-based workloads for data preprocessing. You want to minimize infrastructure management effort. How should you set up the pipeline?

Accepted Answer

Correct answer: B. B. Set up a Vertex AI Pipelines to orchestrate the MLOps pipeline. Use the predefined Dataproc component for the PySpark-based workloads. — The correct answer is B because Vertex AI Pipelines provides a streamlined way to orchestrate MLOps workflows, and using the predefined Dataproc component simplifies the integration with PySpark workloads, minimizing infrastructure management. Option A is incorrect as creating a custom component adds unnecessary complexity. Option C involves using Kubeflow, which may require more management than Vertex AI. Option D suggests using Cloud Composer, which is not as optimized for this specific use case as Vertex AI Pipelines.

Google Cloud Professional Machine Learning Engineer — Question 283

Answer options

Correct answer: B

Explanation