Google Cloud Professional Data Engineer — Question 150

A web server sends click events to a Pub/Sub topic as messages. The web server includes an eventTimestamp attribute in the messages, which is the time when the click occurred. You have a Dataflow streaming job that reads from this Pub/Sub topic through a subscription, applies some transformations, and writes the result to another Pub/Sub topic for use by the advertising department. The advertising department needs to receive each message within 30 seconds of the corresponding click occurrence, but they report receiving the messages late. Your Dataflow job's system lag is about 5 seconds, and the data freshness is about 40 seconds. Inspecting a few messages shows no more than 1 second of lag between their eventTimestamp and publishTime. What is the problem and what should you do?

Answer options

Correct answer: G

Explanation

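The correct answer is G because it identifies that individual messages are processed in less than 30 seconds, yet the Dataflow job cannot keep pace with the backlog accumulating in the Pub/Sub subscription. The metrics support this: a system lag of about 5 seconds means each message moves through the pipeline quickly once it is picked up, while a data freshness of about 40 seconds means the oldest data still waiting for output is already past the 30-second requirement. The gap of at most 1 second between eventTimestamp and publishTime shows that the web server publishes promptly, so the delay builds up in the subscription. The usual remedy for a backlog of this kind is to optimize the job or increase the number of workers. The other options incorrectly assign blame to the advertising department, attribute the delay to slow per-message processing, or imply issues with the web server, none of which the provided data supports.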
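
The reasoning above can be sketched as a small decision rule. This is only an illustration of how the three numbers in the question combine, not a Dataflow or Pub/Sub API: the `PipelineMetrics` class, the `diagnose` function, and the returned labels are hypothetical names, and the simple threshold comparisons are an assumption made for the sake of the example.

```python
from dataclasses import dataclass


@dataclass
class PipelineMetrics:
    """Values taken from the question, all in seconds (hypothetical container)."""
    system_lag_s: float      # Dataflow system lag
    data_freshness_s: float  # Dataflow data freshness
    publish_lag_s: float     # publishTime minus eventTimestamp


def diagnose(m: PipelineMetrics, sla_s: float = 30.0) -> str:
    """Return an illustrative label for where the delay most likely comes from."""
    if m.publish_lag_s >= sla_s:
        # Clicks reach Pub/Sub late: the web server would be the suspect.
        return "publisher"
    if m.system_lag_s >= sla_s:
        # Messages spend too long inside the pipeline itself.
        return "processing"
    if m.data_freshness_s > sla_s:
        # Processing is fast but output lags real time: a backlog is building.
        return "backlog"
    return "ok"


# Numbers from the question: 5 s system lag, 40 s data freshness, 1 s publish lag.
result = diagnose(PipelineMetrics(system_lag_s=5.0, data_freshness_s=40.0, publish_lag_s=1.0))
print(result)
```

With the values from the question, the first two checks pass and only the data freshness check fails, which matches the backlog diagnosis in the explanation.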