Google Cloud Professional Data Engineer — Question 260
You are using Dataflow to build a streaming data pipeline to analyze user website click activity from Pub/Sub. You need to calculate the number of clicks for each user site visit. A site visit is defined as a period of activity followed by 30 minutes of inactivity for a specific user. What should you do?
Answer options
- A. Use tumbling windows with a 30-minute window.
- B. Use hopping windows with a 30-minute window, and a 1-minute period.
- C. Use hopping windows with a 30-minute window, and a 30-minute period.
- D. Use session windows with a 30-minute gap duration.
Correct answer: D
Explanation
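The correct answer is D. Session windows are built per key (here, per user) and are bounded by inactivity rather than by the clock: a window stays open while clicks keep arriving and closes once the gap between consecutive clicks reaches the configured gap duration. This matches the definition of a site visit as a period of activity followed by 30 minutes of inactivity. Tumbling (fixed) windows in option A divide the timeline into fixed 30-minute intervals regardless of user activity, so a single visit can be split across two windows and two separate visits can fall into the same window. Hopping (sliding) windows have the same limitation. Option B produces overlapping 30-minute windows, so each click is counted in many windows. Option C, whose period equals the window size, behaves like a tumbling window.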
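The difference can be illustrated with a small pure-Python sketch that does not use Dataflow or Apache Beam. It simulates the two grouping strategies on a handful of made-up clicks. The function names, the sample data, and the representation of a click as `(user_id, timestamp_in_seconds)` are assumptions for illustration only. In an actual Beam Python pipeline, session windowing would typically be applied with something like `beam.WindowInto(window.Sessions(30 * 60))` followed by a per-key count.

```python
from collections import defaultdict

GAP_SECONDS = 30 * 60  # 30-minute inactivity gap from the question


def count_clicks_per_session(clicks, gap=GAP_SECONDS):
    """Session-style grouping: a new session starts for a user when the
    time since that user's previous click is at least `gap` seconds."""
    by_user = defaultdict(list)
    for user_id, ts in clicks:
        by_user[user_id].append(ts)

    result = {}
    for user_id, timestamps in by_user.items():
        timestamps.sort()
        counts = [1]  # the first click always opens a session
        for prev, cur in zip(timestamps, timestamps[1:]):
            if cur - prev >= gap:
                counts.append(1)   # inactivity gap reached: new session
            else:
                counts[-1] += 1    # still the same visit
        result[user_id] = counts
    return result


def count_clicks_per_fixed_window(clicks, size=GAP_SECONDS):
    """Tumbling-style grouping: clicks are bucketed by clock-aligned
    intervals of `size` seconds, ignoring user inactivity."""
    buckets = defaultdict(lambda: defaultdict(int))
    for user_id, ts in clicks:
        buckets[user_id][ts // size] += 1
    return {u: [w[k] for k in sorted(w)] for u, w in buckets.items()}


# Hypothetical sample data: (user_id, event time in seconds).
clicks = [
    ("alice", 0),
    ("alice", 600),
    ("alice", 1700),
    ("alice", 2000),   # only 300 s after the previous click
    ("bob", 100),
    ("bob", 1000),
    ("alice", 6000),   # 4000 s after the previous alice click
]

session_counts = count_clicks_per_session(clicks)
fixed_counts = count_clicks_per_fixed_window(clicks)
print(session_counts)  # {'alice': [4, 1], 'bob': [2]}
print(fixed_counts)    # {'alice': [3, 1, 1], 'bob': [2]}
```

In this sample, the session-style grouping reports two visits for `alice` (4 clicks, then 1 click), while the tumbling-style grouping splits her first visit at the 1800-second boundary even though her clicks at 1700 and 2000 seconds are only five minutes apart.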