Google Cloud Professional Data Engineer — Question 159

Your infrastructure team has set up an interconnect link between Google Cloud and the on-premises network. You are designing a high-throughput streaming pipeline to ingest data from an Apache Kafka cluster hosted on-premises. You want to store the data in BigQuery with as little latency as possible. What should you do?

Answer options

Correct answer: C

Explanation

Option C is correct: a Dataflow pipeline reads directly from Kafka and writes to BigQuery. Because the interconnect link already connects Google Cloud to the on-premises network, Dataflow workers can reach the Kafka brokers without an intermediary, which keeps the data path short and the latency low. Options A and D both route the data through Pub/Sub, an extra hop that can add latency. Option B adds a proxy host, which complicates the architecture and is unnecessary given the interconnect.
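
To make the direct Kafka-to-BigQuery path concrete, here is a minimal sketch using the Apache Beam Python SDK, which Dataflow runs. It is an illustration under stated assumptions, not a reference implementation: the broker address, topic, table name, and schema are hypothetical placeholders, and the helper kafka_record_to_row is invented for this example. It also assumes the Dataflow workers run in a VPC that can reach the brokers over the interconnect. Note that Beam's Python Kafka source is a cross-language transform, so launching it typically needs a Java runtime available.

```python
# Hypothetical placeholders, none of these values come from the question.
BOOTSTRAP_SERVERS = "10.0.0.5:9092"  # private IP of an on-premises Kafka broker
TOPIC = "events"
TABLE = "my-project:my_dataset.events"
SCHEMA = "event_key:STRING,payload:STRING"


def kafka_record_to_row(record):
    """Convert one (key, value) pair of bytes from Kafka into a BigQuery row dict."""
    key, value = record
    return {
        # Kafka keys are optional, so tolerate a missing key.
        "event_key": key.decode("utf-8") if key is not None else None,
        "payload": value.decode("utf-8"),
    }


def run(argv=None):
    """Build and run the streaming pipeline: Kafka -> row dicts -> BigQuery."""
    # Imported here so the helper above can be exercised without Beam installed.
    import apache_beam as beam
    from apache_beam.io.kafka import ReadFromKafka
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(argv, streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            # By default ReadFromKafka emits (key, value) tuples of bytes.
            | "ReadFromKafka" >> ReadFromKafka(
                consumer_config={"bootstrap.servers": BOOTSTRAP_SERVERS},
                topics=[TOPIC],
            )
            | "ToRow" >> beam.Map(kafka_record_to_row)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                TABLE,
                schema=SCHEMA,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                method=beam.io.WriteToBigQuery.Method.STREAMING_INSERTS,
            )
        )
```

To launch it on Dataflow, one would call run() with the usual runner flags (for example --runner=DataflowRunner plus project, region, and network settings). There is no Pub/Sub topic or proxy host anywhere in the path, which is the point of option C.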