You are designing a stateful data processing pipeline that reads data from a Cloud Storag…

Question

You are designing a stateful data processing pipeline that reads data from a Cloud Storage bucket and writes transformed data to a BigQuery table. The pipeline must be highly available and resilient to zonal failures within the us-central1 region. You need to configure a Dataflow pipeline ensuring minimal disruption during a zonal outage. What should you do?

Accepted Answer

Correct answer: A. Launch the Dataflow job with the --region=us-central1 parameter.

Specifying only the region, without pinning the workers to a particular zone, leaves zone selection to the Dataflow service, which can place the job's workers in any available zone of us-central1. The job is therefore not tied to a zone that may become unavailable. According to this explanation, options B, C, and D each restrict the job to a single, specific zone, which leaves it exposed to a zonal failure and does not provide the required redundancy.
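As a rough illustration of the accepted answer, the sketch below assembles launch flags for a Beam Python pipeline that set the region and deliberately leave the worker zone unset. The helper build_dataflow_args and the project and bucket names are hypothetical placeholders, not part of the question. The flag names (--runner, --project, --region, --temp_location, --worker_zone) are standard Beam Python pipeline options.

```python
from typing import List, Optional


def build_dataflow_args(
    project: str,
    region: str,
    temp_location: str,
    worker_zone: Optional[str] = None,
) -> List[str]:
    """Build launch flags that name a region but never pin a zone.

    Hypothetical helper for illustration only. It rejects a worker zone
    so that zone selection stays with the Dataflow service.
    """
    if worker_zone is not None:
        raise ValueError(
            "Do not pin a worker zone; specify only the region "
            "so the service can choose the zone."
        )
    return [
        "--runner=DataflowRunner",
        f"--project={project}",          # placeholder project ID
        f"--region={region}",            # e.g. us-central1
        f"--temp_location={temp_location}",  # placeholder bucket path
    ]


# Example values below are assumptions, not taken from the question.
args = build_dataflow_args(
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
)
print(" ".join(args))
```

In a real pipeline, a list like this would typically be passed to apache_beam.options.pipeline_options.PipelineOptions, or the same flags would be given on the command line when launching the job.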

Google Cloud Professional Data Engineer — Question 250
