Google Cloud Professional Machine Learning Engineer — Question 285

You are developing a TensorFlow Extended (TFX) pipeline with standard TFX components. The pipeline includes data preprocessing steps. After the pipeline is deployed to production, it will process up to 100 TB of data stored in BigQuery. You need the data preprocessing steps to scale efficiently, publish metrics and parameters to Vertex AI Experiments, and track artifacts by using Vertex ML Metadata. How should you configure the pipeline run?

Answer options

Correct answer: B

Explanation

The correct answer is B. Running the TFX pipeline in Vertex AI Pipelines connects the run to Vertex AI Experiments and Vertex ML Metadata. Setting the Apache Beam pipeline parameters so that the Beam-based preprocessing steps run on Dataflow lets those steps scale to the 100 TB of data in BigQuery. Options A and C do not use Dataflow for data preprocessing. Option D uses Dataflow but does not run the pipeline in Vertex AI Pipelines, so it does not meet the tracking requirements.
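As a rough illustration of what "setting the Apache Beam parameters" can look like, the sketch below builds a list of Beam pipeline options that select the Dataflow runner. The helper name `build_dataflow_beam_args` and the project, region, and bucket values are hypothetical and are not part of the question. The flags themselves (`--runner`, `--project`, `--region`, `--temp_location`, `--max_num_workers`) are standard Beam/Dataflow pipeline options.

```python
from typing import List, Optional


def build_dataflow_beam_args(
    project: str,
    region: str,
    temp_location: str,
    max_num_workers: Optional[int] = None,
) -> List[str]:
    """Return Apache Beam pipeline options that select the Dataflow runner.

    Hypothetical helper for illustration; TFX only needs the resulting list.
    """
    args = [
        "--runner=DataflowRunner",           # run Beam-based steps on Dataflow
        f"--project={project}",              # Google Cloud project for the jobs
        f"--region={region}",                # Dataflow regional endpoint
        f"--temp_location={temp_location}",  # Cloud Storage path for temp files
    ]
    if max_num_workers is not None:
        # Optional upper bound on Dataflow autoscaling.
        args.append(f"--max_num_workers={max_num_workers}")
    return args


# Placeholder values, assumed for the example only.
beam_args = build_dataflow_beam_args(
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
    max_num_workers=50,
)

# In a TFX pipeline definition, this list would typically be passed as the
# `beam_pipeline_args` argument of the pipeline object, and the pipeline
# would then be compiled and submitted to Vertex AI Pipelines.
print(beam_args)
```

The list is kept separate from the pipeline definition so that the same pipeline code can run locally with the default Beam runner during development and switch to Dataflow only for the production run.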