You are designing an Apache Beam pipeline to enrich data from Cloud Pub/Sub with static r…

Question

You are designing an Apache Beam pipeline to enrich data from Cloud Pub/Sub with static reference data from BigQuery. The reference data is small enough to fit in memory on a single worker. The pipeline should write enriched results to BigQuery for analysis. Which job type and transforms should this pipeline use?

Accepted Answer

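Correct answer: C. Streaming job, PubSubIO, BigQueryIO, side-inputs. A streaming job is appropriate because Cloud Pub/Sub is an unbounded source that delivers data continuously. PubSubIO reads the incoming messages, BigQueryIO both reads the reference data and writes the enriched results, and a side input makes the static reference data available to every worker during the enrichment step, which is practical here because that data fits in memory on a single worker. Options A and B are incorrect because they either specify a batch job or use JdbcIO, which is not needed when both the reference data and the output live in BigQuery. Option D is incorrect because it relies on side-outputs, which emit additional output collections from a transform and do not bring reference data into it.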
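The side-input pattern described above can be sketched with the Beam Python SDK, where ReadFromPubSub, ReadFromBigQuery and WriteToBigQuery play the roles of the Java PubSubIO and BigQueryIO connectors named in the answer. This is a minimal sketch under stated assumptions, not a reference implementation: the field names product_id and product_name, the helper names enrich and build_pipeline, and all parameters are hypothetical, the messages are assumed to be UTF-8 JSON, and the pipeline options are assumed to enable streaming and provide a temp location for the BigQuery read.

```python
import json


def enrich(message_bytes, reference):
    """Join one Pub/Sub payload against the in-memory reference dict.

    `reference` is the side input: a plain dict mapping the (hypothetical)
    product_id key to a product_name value.
    """
    event = json.loads(message_bytes.decode("utf-8"))
    # Unknown keys yield None instead of raising, so one bad key
    # does not fail the whole bundle.
    event["product_name"] = reference.get(event.get("product_id"))
    return event


def build_pipeline(pipeline_options, subscription, reference_query, output_table):
    """Wire up the streaming pipeline (all arguments are illustrative)."""
    # Imported lazily so enrich() can be exercised without Beam installed.
    import apache_beam as beam

    p = beam.Pipeline(options=pipeline_options)

    # Bounded read of the small reference table, reshaped to key/value pairs.
    reference = (
        p
        | "ReadReference" >> beam.io.ReadFromBigQuery(
            query=reference_query, use_standard_sql=True)
        | "ToKeyValue" >> beam.Map(
            lambda row: (row["product_id"], row["product_name"]))
    )

    (
        p
        # Unbounded read; payloads arrive as bytes by default.
        | "ReadEvents" >> beam.io.ReadFromPubSub(subscription=subscription)
        # AsDict materializes the reference data as a dict side input.
        | "Enrich" >> beam.Map(enrich, reference=beam.pvalue.AsDict(reference))
        # Assumes the output table already exists with a matching schema.
        | "WriteEnriched" >> beam.io.WriteToBigQuery(
            output_table,
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
    )
    return p
```

Keeping enrich as a plain function means the join logic can be checked locally with an ordinary dict, for example enrich(b'{"product_id": "p1"}', {"p1": "Widget"}), before the pipeline is deployed.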

Google Cloud Professional Data Engineer — Question 53

Explanation