You are migrating a table to BigQuery and are deciding on the data model. Your table stor…

Question

You are migrating a table to BigQuery and are deciding on the data model. Your table stores information related to purchases made across several store locations and includes information like the time of the transaction, items purchased, the store ID, and the city and state in which the store is located. You frequently query this table to see how many of each item were sold over the past 30 days and to look at purchasing trends by state, city, and individual store. How would you model this table for the best query performance?

Accepted Answer

Correct answer: A. A. Partition by transaction time; cluster by state first, then city, then store ID. — The correct answer is A because partitioning by transaction time allows for efficient querying of recent data, while clustering by state, then city, and then store ID optimizes the performance for the specific queries involving geographic locations and trends. Options B and C do not prioritize state in the clustering, which is crucial for the types of queries mentioned, and D lacks the partitioning benefit for time-based queries.

Google Cloud Professional Data Engineer — Question 328

Answer options

Correct answer: A

Explanation