Google Cloud Associate Data Practitioner — Question 22
You need to create a data pipeline that streams event information from applications in multiple Google Cloud regions into BigQuery for near real-time analysis. The data requires transformation before loading. You want to create the pipeline using a visual interface. What should you do?
Answer options
- A. Push event information to a Pub/Sub topic. Create a Dataflow job using the Dataflow job builder.
- B. Push event information to a Pub/Sub topic. Create a Cloud Run function to subscribe to the Pub/Sub topic, apply transformations, and insert the data into BigQuery.
- C. Push event information to a Pub/Sub topic. Create a BigQuery subscription in Pub/Sub.
- D. Push event information to Cloud Storage, and create an external table in BigQuery. Create a BigQuery scheduled job that executes once each day to apply transformations.
Correct answer: A
Explanation
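Option A is correct. Pub/Sub ingests events from applications in any region, and the Dataflow job builder is a visual interface in the Google Cloud console for building and running Dataflow pipelines without writing code. A streaming job built there can read from the topic, apply the required transformations, and write to BigQuery in near real time. Option B would work functionally, but the transformation logic must be written and deployed as code in a Cloud Run function, so it does not meet the visual-interface requirement. Option C loads messages directly from Pub/Sub into a BigQuery table; it is an ingestion path, not a visual pipeline builder, and it does not cover the transformation step described in the question. Option D is a batch design: the transformations run only once a day, so results can lag by up to 24 hours, which rules out near real-time analysis.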
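To make "transformation before loading" concrete, the following is a minimal Python sketch of the kind of per-event logic a streaming pipeline would apply between the Pub/Sub topic and the BigQuery table. It is an illustration only: the function name `transform_event` and the field names (`id`, `region`, `type`, `ts_ms`) are hypothetical, and this is not code that the Dataflow job builder generates. In the job builder, the equivalent step would be configured visually.

```python
import json
from datetime import datetime, timezone


def transform_event(message_data: bytes) -> dict:
    """Turn one JSON-encoded event payload into a flat row for BigQuery.

    All field names here are hypothetical and chosen only for illustration.
    """
    event = json.loads(message_data.decode("utf-8"))

    # Convert epoch milliseconds into a timezone-aware UTC timestamp.
    event_time = datetime.fromtimestamp(event["ts_ms"] / 1000, tz=timezone.utc)

    return {
        "event_id": event["id"],
        # Normalize region names so events from all regions compare equal.
        "region": event["region"].lower(),
        # Fall back to a default when the optional field is missing.
        "event_type": event.get("type", "unknown"),
        "event_time": event_time.isoformat(),
    }


if __name__ == "__main__":
    sample = json.dumps({"id": "e1", "region": "US-CENTRAL1", "ts_ms": 0})
    print(transform_event(sample.encode("utf-8")))
```

With option B, logic like this would have to be written, deployed, and maintained by hand in a Cloud Run function, which is why that option fails the visual-interface requirement even though it can produce the same rows.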