Google Cloud Professional Machine Learning Engineer — Question 305

You are building an ML pipeline to process and analyze both streaming and batch datasets. You need the pipeline to handle data validation, preprocessing, model training, and model deployment in a consistent and automated way. You want to design an efficient and scalable solution that captures model training metadata and is easily reproducible. You want to be able to reuse custom components for different parts of your pipeline. What should you do?

Answer options

Correct answer: D

Explanation

The correct answer is D. An orchestration framework such as Kubeflow Pipelines or Vertex AI Pipelines is designed to manage end-to-end ML workflows: it runs validation, preprocessing, training, and deployment as reusable components, automates their execution, records metadata for each run, and makes runs reproducible. Options A and B focus on data processing and do not provide orchestration for the entire pipeline. Option C is about building and pushing images, which addresses neither orchestration nor the workflow management the question requires.
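To make the idea of orchestration concrete, here is a minimal, library-free Python sketch. It is not the Kubeflow Pipelines or Vertex AI Pipelines API. The names Step, Pipeline, validate, scale, train_mean_model, and build_pipeline are hypothetical and exist only for illustration. The sketch assumes small in-memory rows and shows the three properties the question asks for: reusable components, automated step ordering, and per-step metadata that lets two runs be compared.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """A reusable component: a plain function plus the parameters for this use."""
    name: str
    func: Callable[..., Any]
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Pipeline:
    """Runs steps in order and records metadata about each one."""
    steps: List[Step]
    metadata: List[Dict[str, Any]] = field(default_factory=list)

    def run(self, data: Any) -> Any:
        for step in self.steps:
            data = step.func(data, **step.params)
            # Store the parameters and a digest of the output so that a
            # later run can be checked against this one.
            payload = json.dumps(data, sort_keys=True).encode()
            self.metadata.append({
                "step": step.name,
                "params": step.params,
                "output_digest": hashlib.sha256(payload).hexdigest()[:12],
            })
        return data


def validate(rows, required):
    """Drop rows that are missing any required field."""
    return [r for r in rows if all(r.get(k) is not None for k in required)]


def scale(rows, column, factor):
    """Multiply one column by a constant, returning new rows."""
    return [{**r, column: r[column] * factor} for r in rows]


def train_mean_model(rows, target):
    """Toy 'training': predict the mean of the target column."""
    values = [r[target] for r in rows]
    return {"model": "mean", "prediction": sum(values) / len(values)}


def build_pipeline(factor):
    # The same component functions can be wired into other pipelines
    # with different parameters.
    return Pipeline(steps=[
        Step("validate", validate, {"required": ["x", "y"]}),
        Step("preprocess", scale, {"column": "x", "factor": factor}),
        Step("train", train_mean_model, {"target": "y"}),
    ])


rows = [{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": None}, {"x": 5.0, "y": 4.0}]
pipeline = build_pipeline(factor=0.5)
model = pipeline.run(rows)
```

Running build_pipeline(0.5) a second time on the same rows yields identical metadata entries, which is the toy analogue of a reproducible run. A managed orchestrator adds what this sketch leaves out, such as containerized execution, scheduling, and a persistent metadata store.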