Your infrastructure team has set up an interconnect link between Google Cloud and the on-…

Question

Your infrastructure team has set up an interconnect link between Google Cloud and the on-premises network. You are designing a high-throughput streaming pipeline to ingest data from an Apache Kafka cluster hosted on-premises. You want to store the data in BigQuery with as little latency as possible. What should you do?

Accepted Answer

Correct answer: C. Use Dataflow, write a pipeline that reads the data from Kafka, and writes the data to BigQuery.

Option C reads from Kafka and writes to BigQuery in a single Dataflow pipeline, so no intermediate hop sits between the source and the sink. Because the interconnect already links Google Cloud to the on-premises network, the pipeline can connect to the Kafka cluster directly. Option A routes the data through Pub/Sub, an unnecessary extra step that can add latency. Option B adds a proxy host, which complicates the architecture and is not needed. Option D is similar to A and also involves Pub/Sub, so it is less suitable for minimizing latency.
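A minimal sketch of what option C could look like with the Apache Beam Python SDK, which Dataflow runs. The broker address, topic name, table name, schema, and the `kafka_record_to_row` helper are all hypothetical placeholders rather than anything stated in the question, and the sketch assumes the Kafka messages carry UTF-8 JSON. Note that the Python `ReadFromKafka` transform is a cross-language transform, so it also needs a Java runtime available when the pipeline is built.

```python
import json


def kafka_record_to_row(record):
    """Convert a Kafka (key, value) pair of bytes into a BigQuery row dict.

    Assumes the message value is UTF-8 encoded JSON (an assumption for this
    sketch; the question does not specify the message format).
    """
    key, value = record
    payload = json.loads(value.decode("utf-8"))
    return {
        "event_key": key.decode("utf-8") if key is not None else None,
        # Store the parsed payload as a normalized JSON string.
        "payload": json.dumps(payload, sort_keys=True),
    }


def run(argv=None):
    # Imported here so the helper above can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.io.kafka import ReadFromKafka
    from apache_beam.options.pipeline_options import PipelineOptions

    # streaming=True marks this as an unbounded (streaming) pipeline.
    options = PipelineOptions(argv, streaming=True)

    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            # Hypothetical on-premises broker, reachable over the interconnect.
            | "ReadFromKafka" >> ReadFromKafka(
                consumer_config={"bootstrap.servers": "kafka.onprem.example:9092"},
                topics=["events"],  # hypothetical topic name
            )
            | "ToRow" >> beam.Map(kafka_record_to_row)
            # Hypothetical destination table and schema.
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                table="my-project:my_dataset.events",
                schema="event_key:STRING,payload:STRING",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            )
        )
```

To try it on Dataflow, `run` would be called with the usual pipeline flags, for example `run(["--runner=DataflowRunner", "--project=...", "--region=...", "--temp_location=gs://..."])`, with values for your own environment. The point of the sketch is the shape of the pipeline: one read step, one transform, one write step, with nothing in between.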

Google Cloud Professional Data Engineer — Question 159