Google Cloud Professional Data Engineer — Question 194
You created a new version of a Dataflow streaming data ingestion pipeline that reads from Pub/Sub and writes to BigQuery. The previous version of the pipeline that runs in production uses a 5-minute window for processing. You need to deploy the new version of the pipeline without losing any data, creating inconsistencies, or increasing the processing latency by more than 10 minutes. What should you do?
Answer options
- A. Update the old pipeline with the new pipeline code.
- B. Snapshot the old pipeline, stop the old pipeline, and then start the new pipeline from the snapshot.
- C. Drain the old pipeline, then start the new pipeline.
- D. Cancel the old pipeline, then start the new pipeline.
Correct answer: C
Explanation
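The correct choice is C. Draining tells Dataflow to stop reading from Pub/Sub, finish processing the data it has already pulled, close any open windows, and write the results to BigQuery before the job ends. Messages that were not yet pulled stay unacknowledged in the Pub/Sub subscription, so the new pipeline picks them up once it starts reading from the same subscription, and no data is lost. The added latency is roughly the drain time plus the startup time of the new job, which typically fits within the 10-minute limit for a pipeline with 5-minute windows. Option A is incorrect because an in-place update succeeds only if the new code passes Dataflow's compatibility check against the running job, and changes to windowing or stateful transforms can produce inconsistent results for data that is already in flight. Option B adds unnecessary complexity: any data the old pipeline processes between taking the snapshot and stopping can be processed again by the new pipeline, which risks duplicate rows in BigQuery. Option D is incorrect because cancelling stops the job immediately and discards buffered, in-flight data.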
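The drain-then-start sequence from option C can be sketched as a small Python helper around the gcloud CLI. This is a minimal sketch under stated assumptions, not an official procedure: the job ID and region are placeholders, the function names are hypothetical, the state is assumed to be readable from the `currentState` field of `gcloud dataflow jobs describe`, and the 600-second timeout is only an illustrative stand-in for the 10-minute latency limit in the question.

```python
import subprocess
import time
from typing import Callable, List


def drain_command(job_id: str, region: str) -> List[str]:
    """Build the gcloud command that asks Dataflow to drain a streaming job."""
    return ["gcloud", "dataflow", "jobs", "drain", job_id, f"--region={region}"]


def state_command(job_id: str, region: str) -> List[str]:
    """Build the gcloud command that prints only the job's current state."""
    return [
        "gcloud", "dataflow", "jobs", "describe", job_id,
        f"--region={region}", "--format=value(currentState)",
    ]


def run_gcloud(cmd: List[str]) -> str:
    """Run a command and return its trimmed stdout (raises on a non-zero exit)."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()


def drain_and_wait(
    job_id: str,
    region: str,
    run: Callable[[List[str]], str] = run_gcloud,
    poll_seconds: float = 30.0,
    timeout_seconds: float = 600.0,  # assumption: mirrors the 10-minute limit
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Request a drain, then poll until the job reports JOB_STATE_DRAINED.

    Returns True once the job is drained and False if the timeout elapses
    first. The new pipeline should be started only after True is returned.
    """
    run(drain_command(job_id, region))
    waited = 0.0
    while waited <= timeout_seconds:
        if run(state_command(job_id, region)) == "JOB_STATE_DRAINED":
            return True
        sleep(poll_seconds)
        waited += poll_seconds
    return False
```

A caller would invoke `drain_and_wait("OLD_JOB_ID", "REGION")` with real values and launch the new pipeline against the same Pub/Sub subscription only after it returns `True`. Injecting `run` and `sleep` keeps the polling logic testable without a Google Cloud project.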