Google Cloud Professional Data Engineer — Question 210

You have terabytes of customer behavioral data streaming from Google Analytics into BigQuery daily. Your customers’ information, such as their preferences, is hosted on a Cloud SQL for MySQL database. Your CRM database is hosted on a Cloud SQL for PostgreSQL instance. The marketing team wants to use your customers’ information from the two databases and the customer behavioral data to create marketing campaigns for yearly active customers. You need to ensure that the marketing team can run the campaigns over 100 times a day on typical days and up to 300 during sales. At the same time, you want to keep the load on the Cloud SQL databases to a minimum. What should you do?

Answer options

Correct answer: C

Explanation

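The correct answer is C. Replicating the required tables from both Cloud SQL databases to BigQuery with Datastream lets the marketing team run every campaign query inside BigQuery, alongside the Google Analytics data. Datastream captures changes from the databases' logs (change data capture) instead of querying the source tables on each run, so after the initial backfill the load on Cloud SQL stays low and does not grow with the number of campaign runs, whether that is 100 or 300 a day. Options A and B do not sufficiently minimize the load on Cloud SQL, while D introduces unnecessary complexity with a Dataproc cluster.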
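
To make the reasoning concrete, here is a minimal sketch of what a campaign query could look like once the tables have been replicated. It only builds the SQL text and does not call any Google Cloud API. The project, dataset, table, and column names (`ga_events`, `crm_customers`, `customer_preferences`, `customer_id`, `event_date`, `preference`) are hypothetical placeholders, not names taken from the question.

```python
def build_campaign_query(project: str, dataset: str, days_active: int = 365) -> str:
    """Build a BigQuery Standard SQL query for customers active in the last `days_active` days.

    All table and column names are hypothetical. The assumption is that
    Datastream has already replicated the MySQL and PostgreSQL tables into
    the given BigQuery dataset, so the query never touches Cloud SQL.
    """
    if days_active <= 0:
        raise ValueError("days_active must be positive")

    def table(name: str) -> str:
        # Fully qualified, backtick-quoted BigQuery table reference.
        return f"`{project}.{dataset}.{name}`"

    return (
        "SELECT c.customer_id, p.preference, COUNT(*) AS events\n"
        f"FROM {table('ga_events')} AS e\n"
        f"JOIN {table('crm_customers')} AS c ON c.customer_id = e.customer_id\n"
        f"JOIN {table('customer_preferences')} AS p ON p.customer_id = c.customer_id\n"
        f"WHERE e.event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL {days_active} DAY)\n"
        "GROUP BY c.customer_id, p.preference"
    )


if __name__ == "__main__":
    print(build_campaign_query("my-project", "marketing"))
```

Because all three tables in this sketch live in BigQuery, running the query 100 or 300 times a day would add no read traffic to either Cloud SQL instance.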