Google Cloud Professional Data Engineer — Question 184
You have an Oracle database deployed in a VM as part of a Virtual Private Cloud (VPC) network. You want to replicate and continuously synchronize 50 tables to BigQuery. You want to minimize the need to manage infrastructure. What should you do?
Answer options
- A. Deploy Apache Kafka in the same VPC network, use Kafka Connect Oracle Change Data Capture (CDC), and Dataflow to stream the Kafka topic to BigQuery.
- B. Create a Pub/Sub subscription to write to BigQuery directly. Deploy the Debezium Oracle connector to capture changes in the Oracle database, and sink to the Pub/Sub topic.
- C. Deploy Apache Kafka in the same VPC network, use Kafka Connect Oracle Change Data Capture (CDC), and the Kafka Connect Google BigQuery Sink Connector.
- D. Create a Datastream service from Oracle to BigQuery, use a private connectivity configuration to the same VPC network, and a connection profile to BigQuery.
Correct answer: D
Explanation
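The correct answer is D. Datastream is a serverless change data capture (CDC) and replication service that supports Oracle as a source and BigQuery as a destination, so it keeps the 50 tables continuously synchronized without any clusters or connectors for you to run. The private connectivity configuration lets Datastream reach the Oracle VM inside the VPC network, and connection profiles define the Oracle source and the BigQuery destination. Options A and C require you to deploy and operate an Apache Kafka cluster and Kafka Connect workers (plus a Dataflow pipeline in option A), which adds infrastructure to manage. Option B uses a managed Pub/Sub BigQuery subscription for delivery, but you must still host and maintain the Debezium Oracle connector yourself.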
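To make option D more concrete, the following is a minimal sketch of what a Datastream stream definition for this scenario might look like, built as a plain Python dictionary. The field names are intended to mirror the Datastream REST API's Stream resource and should be checked against the current API reference. The project ID, region, schema name, table names, connection profile IDs, and the 900-second freshness value are all hypothetical placeholders. The private connectivity configuration is attached to the Oracle connection profile rather than to the stream, so it does not appear here.

```python
import json

# Hypothetical identifiers, for illustration only.
PROJECT = "my-project"
LOCATION = "us-central1"
SCHEMA = "SALES"
# Stand-ins for the 50 tables to replicate: TABLE_01 ... TABLE_50.
TABLES = [f"TABLE_{i:02d}" for i in range(1, 51)]


def profile_path(profile_id: str) -> str:
    """Build the full resource name of a (hypothetical) connection profile."""
    return (
        f"projects/{PROJECT}/locations/{LOCATION}"
        f"/connectionProfiles/{profile_id}"
    )


def build_stream_config(schema: str, tables: list[str]) -> dict:
    """Assemble a stream definition that replicates the given Oracle tables."""
    return {
        "displayName": "oracle-to-bigquery",
        "sourceConfig": {
            # This profile would reference the private connectivity configuration.
            "sourceConnectionProfile": profile_path("oracle-source"),
            "oracleSourceConfig": {
                "includeObjects": {
                    "oracleSchemas": [
                        {
                            "schema": schema,
                            "oracleTables": [{"table": t} for t in tables],
                        }
                    ]
                }
            },
        },
        "destinationConfig": {
            "destinationConnectionProfile": profile_path("bigquery-destination"),
            "bigqueryDestinationConfig": {
                # Assumed staleness bound for the BigQuery tables.
                "dataFreshness": "900s",
                "sourceHierarchyDatasets": {
                    "datasetTemplate": {"location": "US"}
                },
            },
        },
        # Backfill existing rows for every included table before streaming changes.
        "backfillAll": {},
    }


if __name__ == "__main__":
    stream = build_stream_config(SCHEMA, TABLES)
    print(json.dumps(stream, indent=2))
```

The sketch defines only a stream and two connection profile references. There is no Kafka cluster, Kafka Connect worker, or Debezium process to provision, which is why option D best meets the requirement to minimize infrastructure management.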