A shipping company has live package-tracking data that is sent to an Apache Kafka stream…

Question

A shipping company has live package-tracking data that is sent to an Apache Kafka stream in real time. This is then loaded into BigQuery. Analysts in your company want to query the tracking data in BigQuery to analyze geospatial trends in the lifecycle of a package. The table was originally created with ingest-date partitioning. Over time, the query processing time has increased. You need to copy all the data to a new clustered table. What should you do?

Accepted Answer

Correct answer: B. Implement clustering in BigQuery on the package-tracking ID column. Clustering stores rows sorted by the clustering column, so queries that filter or aggregate on the package-tracking ID can skip storage blocks that cannot contain matching rows and scan less data. Analysts following the lifecycle of a package are likely to filter on that ID, which makes it a sensible clustering key. Options A and C focus on partitioning by other dates, which would not address the performance issue effectively, since the table is already partitioned by date. Option D involves external storage, which complicates access and does not directly improve query performance within BigQuery.
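As a rough illustration of copying the data into a clustered table, the sketch below builds a BigQuery `CREATE TABLE ... AS SELECT` statement in Python. The table names and the `tracking_id` and `ingest_time` column names are hypothetical and do not come from the question. The sketch also assumes that the ingestion-time pseudo-column `_PARTITIONTIME` is carried over into a regular column, so the new table can remain date-partitioned.

```python
def build_clustered_copy_ddl(source_table: str,
                             dest_table: str,
                             cluster_column: str = "tracking_id") -> str:
    """Return a DDL statement that copies source_table into a new
    date-partitioned table clustered on cluster_column.

    All names are illustrative assumptions, not taken from the question.
    """
    return (
        f"CREATE TABLE `{dest_table}`\n"
        # Partition on the copied ingestion timestamp (assumed column name).
        f"PARTITION BY DATE(ingest_time)\n"
        # Cluster on the package-tracking ID so lookups by ID scan less data.
        f"CLUSTER BY {cluster_column}\n"
        f"AS\n"
        # _PARTITIONTIME is not returned by SELECT *, so select it explicitly.
        f"SELECT *, _PARTITIONTIME AS ingest_time\n"
        f"FROM `{source_table}`"
    )


# Hypothetical project, dataset, and table names.
ddl = build_clustered_copy_ddl(
    "my_project.shipping.tracking",
    "my_project.shipping.tracking_clustered",
)
print(ddl)
```

The generated statement could then be run in the BigQuery console or submitted through a client library. Once the copy is verified, analysts would point their queries at the new table.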

Google Cloud Professional Data Engineer — Question 135

Answer options

Explanation