Google Cloud Professional Data Engineer — Question 324
You are operating a streaming Cloud Dataflow pipeline. Your engineers have a new version of the pipeline with a different windowing algorithm and triggering strategy. You want to update the running pipeline with the new version. You want to ensure that no data is lost during the update. What should you do?
Answer options
- A. Update the Cloud Dataflow pipeline inflight by passing the --update option with the --jobName set to the existing job name
- B. Update the Cloud Dataflow pipeline inflight by passing the --update option with the --jobName set to a new unique job name
- C. Stop the Cloud Dataflow pipeline with the Cancel option. Create a new Cloud Dataflow job with the updated code
- D. Stop the Cloud Dataflow pipeline with the Drain option. Create a new Cloud Dataflow job with the updated code
Correct answer: D
Explanation
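The correct answer is D. Draining a job stops it from reading new input while letting it finish processing the data it has already ingested and buffered, so nothing in flight is discarded. Input that has not yet been read remains in the source (for example, a Pub/Sub subscription) and is picked up by the new job. Cancel (option C) halts the job immediately, so buffered in-flight data may be lost. An in-place update (option A) does preserve in-flight data, but the Dataflow documentation warns that major changes to windowing or triggers, such as changing the windowing algorithm, can produce unpredictable results, so it is not the safe choice for this change. Option B does not work because --update only applies when the job name matches that of the running job.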
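The drain-then-relaunch sequence from option D can be sketched as two commands. This is a minimal illustration, not a definitive procedure. It assumes the updated pipeline is a Beam Python script (so the option is spelled --job_name rather than the Java-style --jobName used in the question). The script name, job ID, project, and region are hypothetical placeholders, and other options a real launch would need, such as staging and temp locations, are omitted. The helpers only build the commands and do not execute anything.

```python
from typing import List


def drain_command(job_id: str, region: str) -> List[str]:
    """Build the gcloud command that drains a running Dataflow job."""
    return ["gcloud", "dataflow", "jobs", "drain", job_id, f"--region={region}"]


def launch_command(script: str, job_name: str, project: str, region: str) -> List[str]:
    """Build the command that starts the updated pipeline as a new job.

    Assumes a Beam Python pipeline script that accepts the standard
    pipeline options; `script` is a hypothetical file name.
    """
    return [
        "python",
        script,
        "--runner=DataflowRunner",
        "--streaming",
        f"--job_name={job_name}",
        f"--project={project}",
        f"--region={region}",
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    steps = [
        drain_command("2024-01-01_00_00_00-1234567890", "us-central1"),
        launch_command("pipeline_v2.py", "events-v2", "my-project", "us-central1"),
    ]
    for step in steps:
        # Print rather than execute, so each command can be reviewed first.
        print(" ".join(step))
```

In practice, the new job would typically be launched only after the old job has finished draining, so that the two jobs do not read from the same source at the same time.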