You are developing a TensorFlow Extended (TFX) pipeline with standard TFX components. The…

Question

You are developing a TensorFlow Extended (TFX) pipeline with standard TFX components. The pipeline includes data preprocessing steps. After the pipeline is deployed to production, it will process up to 100 TB of data stored in BigQuery. You need the data preprocessing steps to scale efficiently, publish metrics and parameters to Vertex AI Experiments, and track artifacts by using Vertex ML Metadata. How should you configure the pipeline run?

Accepted Answer

Correct answer: B. Run the TFX pipeline in Vertex AI Pipelines. Set the appropriate Apache Beam parameters in the pipeline to run the data preprocessing steps in Dataflow.

B is correct because it meets both requirements. Standard TFX components use Apache Beam for data processing, so setting the Beam parameters to select the Dataflow runner lets the preprocessing steps scale out across Dataflow workers, which suits datasets of up to 100 TB. Running the pipeline in Vertex AI Pipelines covers the tracking side: artifacts are recorded in Vertex ML Metadata, and pipeline runs can be associated with Vertex AI Experiments. Options A and C do not use Dataflow for data preprocessing. Option D uses Dataflow but does not run the pipeline in Vertex AI Pipelines.
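As a rough illustration of "setting the appropriate Apache Beam parameters", the sketch below builds the list of Beam pipeline options that selects the Dataflow runner. The helper name `dataflow_beam_args` and the project, region, and bucket values are hypothetical placeholders, not taken from the question. In a TFX pipeline definition, a list like this is typically passed through the `beam_pipeline_args` argument of `tfx.dsl.Pipeline`; that step is only described in a comment so the example runs without TFX installed.

```python
from typing import List


def dataflow_beam_args(project: str, region: str, temp_location: str) -> List[str]:
    """Build Apache Beam pipeline options that select the Dataflow runner.

    Hypothetical helper for illustration; the flags themselves are
    standard Beam pipeline options.
    """
    return [
        "--runner=DataflowRunner",            # run Beam jobs on Dataflow
        f"--project={project}",               # Google Cloud project ID
        f"--region={region}",                 # Dataflow regional endpoint
        f"--temp_location={temp_location}",   # Cloud Storage path for temp files
    ]


# Placeholder values (assumptions); replace with your own project settings.
beam_pipeline_args = dataflow_beam_args(
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
)

# In a TFX pipeline definition this list would typically be supplied as:
#   tfx.dsl.Pipeline(..., beam_pipeline_args=beam_pipeline_args)
# and the pipeline then compiled and submitted to Vertex AI Pipelines.
print(beam_pipeline_args)
```

Passing the options at the pipeline level applies them to every Beam-based component. Real workloads usually need further Dataflow options, such as worker limits or a service account, which are left out here.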

Google Cloud Professional Machine Learning Engineer — Question 285
