Google Cloud Professional Data Engineer — Question 229
You have created an external table for Apache Hive partitioned data that resides in a Cloud Storage bucket, which contains a large number of files. You notice that queries against this table are slow. You want to improve the performance of these queries. What should you do?
Answer options
- A. Change the storage class of the Hive partitioned data objects from Coldline to Standard.
- B. Create an individual external table for each Hive partition by using a common table name prefix. Use wildcard table queries to reference the partitioned data.
- C. Upgrade the external table to a BigLake table. Enable metadata caching for the table.
- D. Migrate the Hive partitioned data objects to a multi-region Cloud Storage bucket.
Correct answer: C
Explanation
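The slowdown comes from metadata overhead: with a plain external table, BigQuery has to list the objects in the Cloud Storage bucket to work out which files and partitions to read, and that listing gets expensive when the bucket holds a large number of files. A BigLake table with metadata caching enabled stores file names, partitioning information, and file-level metadata such as row counts, so queries can skip the object listing and prune partitions and files faster. Option A does not help because the storage class mainly affects cost and availability, not how quickly objects are listed or read, and the question does not say the data is in Coldline. Option B does not work because wildcard table queries apply only to native BigQuery tables, not to external tables, and splitting the data into many tables adds management overhead without removing the listing cost. Option D changes where the data is stored but not how many files must be listed, so it does not address the bottleneck.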
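As a rough illustration of option C, the sketch below builds a BigQuery DDL statement that recreates a Hive-partitioned external table as a BigLake table with metadata caching turned on. The helper name `biglake_ddl`, the table and connection IDs, the bucket path, the Parquet format, and the 4-hour staleness are all assumptions made for the example and do not come from the question. The `max_staleness` and `metadata_cache_mode` options are the ones that enable caching.

```python
def biglake_ddl(table, connection, source_prefix,
                file_format="PARQUET", max_staleness_hours=4):
    """Build DDL for a Hive-partitioned BigLake table with metadata caching.

    All arguments are hypothetical placeholders supplied by the caller.
    """
    # BigQuery accepts a staleness interval between 30 minutes and 7 days;
    # this sketch only handles whole hours within that range.
    if not 1 <= max_staleness_hours <= 168:
        raise ValueError("max_staleness_hours must be between 1 and 168")

    prefix = source_prefix.rstrip("/")
    return (
        f"CREATE OR REPLACE EXTERNAL TABLE `{table}`\n"
        # No column list: partition keys are inferred from the Hive layout.
        "WITH PARTITION COLUMNS\n"
        # The Cloud resource connection is what makes this a BigLake table.
        f"WITH CONNECTION `{connection}`\n"
        "OPTIONS (\n"
        f"  format = '{file_format}',\n"
        f"  uris = ['{prefix}/*'],\n"
        f"  hive_partition_uri_prefix = '{prefix}',\n"
        # These two options turn on metadata caching.
        f"  max_staleness = INTERVAL {max_staleness_hours} HOUR,\n"
        "  metadata_cache_mode = 'AUTOMATIC'\n"
        ");"
    )


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    print(biglake_ddl(
        table="my-project.my_dataset.sales",
        connection="my-project.us.my-connection",
        source_prefix="gs://my-bucket/sales/",
    ))
```

The generated statement could be run in the BigQuery console or submitted through a client library. With `metadata_cache_mode = 'AUTOMATIC'`, BigQuery refreshes the cache on a system-defined schedule; `'MANUAL'` is the alternative when refreshes should be triggered explicitly.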