Google Cloud Professional Data Engineer — Question 293
You are migrating an application that tracks library books and information about each book, such as author or year published, from an on-premises data warehouse to BigQuery. In your current relational database, the author information is kept in a separate table and joined to the book information on a common key. Based on Google's recommended practice for schema design, how would you structure the data to ensure optimal speed of queries about the author of each book that has been borrowed?
Answer options
- A. Keep the schema the same, maintain the different tables for the book and each of the attributes, and query as you are doing today.
- B. Create a table that is wide and includes a column for each attribute, including the author's first name, last name, date of birth, etc.
- C. Create a table that includes information about the books and authors, but nest the author fields inside the author column.
- D. Keep the schema the same, create a view that joins all of the tables, and always query the view.
Correct answer: C
Explanation
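The correct answer is C. Google's recommended practice for BigQuery is to denormalize and to use nested and repeated fields (RECORD, also called STRUCT, columns) where a relational design would use a join. Nesting the author fields inside an author column keeps each book and its author in the same row, so a query about the author of a borrowed book reads one table and needs no join. Options A and D keep the normalized tables. The view in D stores no data, so the join still runs every time the view is queried. Option B also removes the join, but flattening every author attribute into top-level columns loses the grouping of the author data, and if a book has more than one author it forces either duplicated book rows or numbered columns. A nested, repeated author field handles that case directly.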
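
To make option C concrete, here is a minimal sketch of what such a table definition could look like, written in the JSON schema layout that BigQuery accepts for table schemas. The column names (`book_id`, `title`, `first_name`, and so on) and the `library.books` table name are assumptions for illustration; the question only mentions author and year published.

```python
import json

# Hypothetical schema for a single denormalized books table.
# The author attributes live inside one nested RECORD (STRUCT) column
# instead of a separate authors table joined on a key.
BOOKS_SCHEMA = [
    {"name": "book_id", "type": "STRING", "mode": "REQUIRED"},
    {"name": "title", "type": "STRING", "mode": "NULLABLE"},
    {"name": "year_published", "type": "INTEGER", "mode": "NULLABLE"},
    {
        "name": "author",
        "type": "RECORD",
        # Use "REPEATED" instead if a book can have several authors.
        "mode": "NULLABLE",
        "fields": [
            {"name": "first_name", "type": "STRING", "mode": "NULLABLE"},
            {"name": "last_name", "type": "STRING", "mode": "NULLABLE"},
            {"name": "date_of_birth", "type": "DATE", "mode": "NULLABLE"},
        ],
    },
]

# Nested fields are addressed with dot notation, so no JOIN is needed.
# `library.books` is an assumed dataset and table name.
AUTHOR_QUERY = """
SELECT title, author.first_name, author.last_name
FROM `library.books`
"""


def leaf_columns(schema, prefix=""):
    """Return the dotted path of every leaf column, e.g. 'author.last_name'."""
    paths = []
    for field in schema:
        path = prefix + field["name"]
        if field["type"] == "RECORD":
            # Recurse into the nested fields of a RECORD column.
            paths.extend(leaf_columns(field["fields"], path + "."))
        else:
            paths.append(path)
    return paths


if __name__ == "__main__":
    # Print the schema as JSON, e.g. to save it as a schema file.
    print(json.dumps(BOOKS_SCHEMA, indent=2))
    print(leaf_columns(BOOKS_SCHEMA))
```

The helper `leaf_columns` is only there to show how the nested fields are addressed. In practice the schema would be supplied when the table is created, and queries would reference `author.first_name` and similar paths directly.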