Google Cloud Professional Data Engineer — Question 64
You are deploying 10,000 new Internet of Things devices to collect temperature data in your warehouses globally. You need to process, store and analyze these very large datasets in real time. What should you do?
Answer options
- A. Send the data to Google Cloud Datastore and then export to BigQuery.
- B. Send the data to Google Cloud Pub/Sub, stream Cloud Pub/Sub to Google Cloud Dataflow, and store the data in Google BigQuery.
- C. Send the data to Cloud Storage and then spin up an Apache Hadoop cluster as needed in Google Cloud Dataproc whenever analysis is required.
- D. Export logs in batch to Google Cloud Storage and then spin up a Google Cloud SQL instance, import the data from Cloud Storage, and run an analysis as needed.
Correct answer: B
Explanation
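Option B is correct because it is the only fully streaming design. Cloud Pub/Sub ingests messages from the devices as they are published, Cloud Dataflow processes the stream as it arrives, and BigQuery stores the results and supports analysis at scale. Option A relies on a batch export from Cloud Datastore, a database aimed at application workloads rather than analytics. Option C lands the data in Cloud Storage and analyzes it only when a Dataproc cluster is started, which is batch processing. Option D is also a batch export and import, and Cloud SQL is a relational database that is not designed for analytics over very large datasets.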
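The Pub/Sub to Dataflow to BigQuery path in option B can be sketched with the Apache Beam Python SDK, which is what Dataflow executes. This is a minimal illustration under stated assumptions, not a reference implementation: the topic name, table name, schema, and the JSON field names (device_id, temp_c, ts) are all hypothetical and do not come from the question.

```python
import json
from datetime import datetime, timezone

# Hypothetical resource names (assumptions for illustration only).
TOPIC = "projects/my-project/topics/warehouse-temperature"
TABLE = "my-project:iot.temperature_readings"
SCHEMA = "device_id:STRING,temperature_c:FLOAT,event_time:TIMESTAMP"


def parse_reading(message: bytes) -> dict:
    """Turn one Pub/Sub payload (assumed to be UTF-8 JSON) into a BigQuery row."""
    data = json.loads(message.decode("utf-8"))
    # "ts" is assumed to be seconds since the Unix epoch.
    event_time = datetime.fromtimestamp(data["ts"], tz=timezone.utc)
    return {
        "device_id": str(data["device_id"]),
        "temperature_c": float(data["temp_c"]),
        "event_time": event_time.isoformat(),
    }


def run() -> None:
    # Imported here so parse_reading can be exercised without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # A Pub/Sub source is unbounded, so the pipeline must run in streaming mode.
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(topic=TOPIC)
            | "ParseJson" >> beam.Map(parse_reading)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                TABLE,
                schema=SCHEMA,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )


if __name__ == "__main__":
    run()
```

To run this on Dataflow rather than locally, the script would typically be launched with runner options such as --runner=DataflowRunner along with a project, region, and temp location. A production pipeline would likely also need handling for malformed messages, which this sketch omits.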