Google Cloud Professional Data Engineer — Question 235
You need to look at BigQuery data from a specific table multiple times a day. The underlying table you are querying is several petabytes in size, but you want to filter your data and provide simple aggregations to downstream users. You want to run queries faster and get up-to-date insights quicker. What should you do?
Answer options
- A. Run a scheduled query to pull the necessary data at specific intervals daily.
- B. Use a cached query to accelerate time to results.
- C. Limit the query columns being pulled in the final result.
- D. Create a materialized view based off of the query being run.
Correct answer: D
Explanation
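Option D is correct. A materialized view precomputes the filter and aggregations and stores the result, so repeated queries read the small precomputed result instead of rescanning the multi-petabyte base table. BigQuery refreshes materialized views automatically and incrementally, and a query against the view also accounts for base-table changes made since the last refresh, so results stay up to date. Option A writes a snapshot at fixed intervals, so downstream users see stale data between runs. Option B helps only when the query text is identical and the underlying table has not changed, so cached results rarely apply to frequently updated data. Option C reduces the bytes scanned, but every run still scans the base table and recomputes the aggregation.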
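
As a rough illustration of Option D, the sketch below builds a `CREATE MATERIALIZED VIEW` statement for a filtered aggregation. The function name and every project, dataset, table, and column name are hypothetical placeholders, not taken from the question; only the DDL shape is meant to carry over.

```python
def build_materialized_view_ddl(
    view: str,
    base_table: str,
    group_col: str,
    value_col: str,
    filter_expr: str,
) -> str:
    """Return BigQuery DDL for a materialized view that filters and aggregates.

    All identifiers passed in are assumed to be trusted, already-validated
    names; this helper does no escaping beyond backtick-quoting table paths.
    """
    return (
        f"CREATE MATERIALIZED VIEW `{view}` AS\n"
        f"SELECT {group_col}, COUNT(*) AS row_count, "
        f"SUM({value_col}) AS total_{value_col}\n"
        f"FROM `{base_table}`\n"
        f"WHERE {filter_expr}\n"
        f"GROUP BY {group_col}"
    )


# Hypothetical names, for illustration only.
ddl = build_materialized_view_ddl(
    view="my-project.analytics.sales_by_region_mv",
    base_table="my-project.analytics.sales",
    group_col="region",
    value_col="amount",
    filter_expr="status = 'COMPLETED'",
)
print(ddl)

# To run it, one option is the google-cloud-bigquery client (not executed here):
#   from google.cloud import bigquery
#   bigquery.Client().query(ddl).result()
```

Downstream users would then query the view (or the base table, which BigQuery may rewrite to use the view) rather than re-running the full aggregation over the petabyte-scale table.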