Google Cloud Professional Data Engineer — Question 67
You have a requirement to insert minute-resolution data from 50,000 sensors into a BigQuery table. You expect significant growth in data volume and need the data to be available within 1 minute of ingestion for real-time analysis of aggregated trends. What should you do?
Answer options
- A. Use bq load to load a batch of sensor data every 60 seconds.
- B. Use a Cloud Dataflow pipeline to stream data into the BigQuery table.
- C. Use the INSERT statement to insert a batch of data every 60 seconds.
- D. Use the MERGE statement to apply updates in batch every 60 seconds.
Correct answer: B
Explanation
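B is correct. A Cloud Dataflow pipeline can stream rows into BigQuery continuously, so new sensor readings become queryable within seconds of arrival, and Dataflow can scale its workers as data volume grows. Options A and C run a batch job every 60 seconds: a reading that arrives just after one batch starts has to wait for the next one, so once job execution time is added it can miss the 1-minute availability target. Load jobs are also subject to per-table quotas, and frequent INSERT statements are not an efficient way to ingest high-volume data. Option D has the same batching drawback, and MERGE is designed to update or upsert existing rows rather than to append new sensor readings.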
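To put the scale of the scenario in perspective, here is a rough back-of-the-envelope sketch. It assumes each sensor emits exactly one row per minute (our reading of "minute-resolution"), and the function name `ingest_profile` is hypothetical, chosen only for illustration.

```python
# Rough sizing of the ingestion workload described in the question.
# Assumption: one row per sensor per minute; real payloads may differ.

def ingest_profile(sensors, readings_per_min=1, batch_interval_s=60):
    """Return approximate row rates and the number of batch jobs per day."""
    rows_per_min = sensors * readings_per_min
    return {
        "rows_per_minute": rows_per_min,
        "rows_per_second": rows_per_min / 60,
        "rows_per_day": rows_per_min * 60 * 24,
        # Number of jobs needed if data is loaded once per batch interval.
        "batch_jobs_per_day": 86_400 // batch_interval_s,
    }

profile = ingest_profile(50_000)
for name, value in profile.items():
    print(f"{name}: {value:,.0f}")
```

Under these assumptions the table receives 50,000 rows per minute (about 72 million rows per day), and a 60-second batch schedule would mean 1,440 jobs per day against a single table before any growth. That is the kind of sustained, growing load a streaming pipeline is meant to absorb.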