Google Cloud Professional Data Engineer — Question 35
Your globally distributed auction application allows users to bid on items. Occasionally, users place identical bids at nearly identical times, and different application servers process those bids. Each bid event contains the item, amount, user, and timestamp. You want to collate those bid events into a single location in real time to determine which user bid first. What should you do?
Answer options
- A. Create a file on a shared file system and have the application servers write all bid events to that file. Process the file with Apache Hadoop to identify which user bid first.
- B. Have each application server write the bid events to Cloud Pub/Sub as they occur. Push the events from Cloud Pub/Sub to a custom endpoint that writes the bid event information into Cloud SQL.
- C. Set up a MySQL database for each application server to write bid events into. Periodically query each of those distributed MySQL databases and update a master MySQL database with bid event information.
- D. Have each application server write the bid events to Google Cloud Pub/Sub as they occur. Use a pull subscription to pull the bid events using Google Cloud Dataflow. Give the bid for each item to the user in the bid event that is processed first.
Correct answer: D
Explanation
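The correct answer is D. Each application server publishes bid events to Google Cloud Pub/Sub as they occur, which collates events from every region into a single stream. Google Cloud Dataflow consumes that stream through a pull subscription and processes it in real time, so the first bid for each item can be determined as events arrive. Option A is not real time: every globally distributed server must write to one shared file, and Apache Hadoop processes that file in batch. Option B also ingests events through Cloud Pub/Sub, but pushing them to a custom endpoint that writes into Cloud SQL only stores the events; it adds an endpoint to build and operate and has no stream-processing step that decides which bid came first. Option C is not real time either: bids stay scattered across per-server MySQL databases until a periodic query copies them into the master database, which delays identifying the first bid.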
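The "which user bid first" step can be sketched in plain Python. This is only an illustration of the comparison logic, not Dataflow or Pub/Sub code: the `Bid` class, the `first_bidders` function, and the sample data are hypothetical, and the sketch assumes each event's timestamp is a number of seconds since the epoch and that ties are decided by that timestamp rather than by arrival order.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Bid:
    """One bid event with the four fields named in the question."""
    item: str
    amount: float
    user: str
    timestamp: float  # assumed format: seconds since the epoch


def first_bidders(bids: Iterable[Bid]) -> Dict[Tuple[str, float], Bid]:
    """Return the earliest bid for each (item, amount) pair."""
    earliest: Dict[Tuple[str, float], Bid] = {}
    for bid in bids:
        key = (bid.item, bid.amount)
        current = earliest.get(key)
        # Keep the bid with the smallest event timestamp seen so far.
        if current is None or bid.timestamp < current.timestamp:
            earliest[key] = bid
    return earliest


# Hypothetical events, listed in arrival order rather than event order.
bids = [
    Bid("lamp", 25.0, "bob", 1700000000.250),
    Bid("lamp", 25.0, "alice", 1700000000.100),
    Bid("vase", 40.0, "carol", 1700000000.300),
]
winners = first_bidders(bids)
print(winners[("lamp", 25.0)].user)  # alice
```

In a streaming pipeline the same comparison would run per key as events arrive from the subscription; the sketch keeps everything in memory only to show the rule itself.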