Google Cloud Professional Cloud Database Engineer — Question 131
You are building a data warehouse on BigQuery. Sources of data include several MySQL databases located on-premises. You need to transfer data from these databases into BigQuery for analytics. You want to use a managed solution that has low latency and is easy to set up. What should you do?
Answer options
- A. Use Datastream to connect to your on-premises database and create a stream. Have Datastream write to Cloud Storage. Then use Dataflow to process the data into BigQuery.
- B. Use Cloud Data Fusion and scheduled workflows to extract data from MySQL. Transform this data into the appropriate schema, and load this data into your BigQuery database.
- C. Use Database Migration Service to replicate data to a Cloud SQL for MySQL instance. Create federated tables in BigQuery on top of the replicated instances to transform and load the data into your BigQuery database.
- D. Create extracts from your on-premises databases periodically, and push these extracts to Cloud Storage. Upload the changes into BigQuery, and merge them with existing tables.
Correct answer: A
Explanation
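The correct answer is A. Datastream is a managed, serverless change data capture (CDC) service: it connects to the on-premises MySQL databases, captures row-level changes as they occur, and writes them to Cloud Storage with low latency. Dataflow then loads those changes into BigQuery. Option B relies on scheduled Cloud Data Fusion workflows, so data arrives in batches rather than continuously, and the pipelines take more effort to design and maintain. Option C adds an intermediate Cloud SQL for MySQL instance plus federated tables, which means extra infrastructure to operate and an extra hop before the data lands in BigQuery. Option D is a manual batch process: periodic extracts add latency, and you must build and maintain the extract, upload, and merge steps yourself.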
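To illustrate why a stream of changes can keep a warehouse table current without periodic full extracts, here is a minimal Python sketch of applying ordered change events to a keyed table. It is a conceptual model only: the `ChangeEvent` class, its fields, and the `apply_changes` function are hypothetical names invented for this example, and they do not represent Datastream's actual event format or the logic of any Dataflow template.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record; not Datastream's real event schema."""
    key: int                    # primary key of the affected source row
    op: str                     # "INSERT", "UPDATE", or "DELETE"
    position: int               # assumed ordering value (stand-in for a log offset)
    row: Optional[dict] = None  # new row contents; None for deletes


def apply_changes(table: dict, events: list) -> dict:
    """Return a copy of `table` with `events` applied in position order."""
    result = dict(table)
    for ev in sorted(events, key=lambda e: e.position):
        if ev.op == "DELETE":
            # Removing a key that is already absent is treated as a no-op.
            result.pop(ev.key, None)
        elif ev.op in ("INSERT", "UPDATE"):
            # Inserts and updates are both handled as upserts by key.
            result[ev.key] = ev.row
        else:
            raise ValueError(f"unknown operation: {ev.op}")
    return result


# Example: events arrive out of order and are applied by position.
existing = {1: {"name": "a"}, 2: {"name": "b"}}
events = [
    ChangeEvent(key=2, op="DELETE", position=3),
    ChangeEvent(key=1, op="UPDATE", position=2, row={"name": "a2"}),
    ChangeEvent(key=3, op="INSERT", position=1, row={"name": "c"}),
]
current = apply_changes(existing, events)
```

In this sketch only the changed rows are processed, which is the general reason a CDC-based approach can offer lower latency than the periodic extract-and-merge process described in option D.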