AWS Certified Big Data – Specialty — Question 65
A customer needs to determine the optimal distribution strategy for the ORDERS fact table in its Redshift schema. The ORDERS table has foreign key relationships with multiple dimension tables in this schema.
How should the customer determine the most appropriate distribution key for the ORDERS table?
Answer options
- A. Identify the largest and most frequently joined dimension table and ensure that it and the ORDERS table both have EVEN distribution.
- B. Identify the largest dimension table and designate the key of this dimension table as the distribution key of the ORDERS table.
- C. Identify the smallest dimension table and designate the key of this dimension table as the distribution key of the ORDERS table.
- D. Identify the largest and the most frequently joined dimension table and designate the key of this dimension table as the distribution key of the ORDERS table.
Correct answer: D
Explanation
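The correct answer is D. Distributing the ORDERS table on the key of the largest and most frequently joined dimension table, with that dimension table distributed on the same key, places matching rows on the same slice. This minimizes data movement during joins, which improves query performance. Option A identifies the right dimension table but uses EVEN distribution, which spreads rows round-robin and therefore does not collocate joining rows. Option B considers only table size and ignores how often the table is joined. Option C chooses the smallest dimension table, which is the join where collocation saves the least data movement, so the larger and more frequent joins would still require redistribution.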
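The collocation argument can be illustrated with a small simulation. This is only a sketch under stated assumptions: `NUM_SLICES`, `slice_for`, `rows_needing_movement`, and the sample data are all hypothetical, and the modulo function is a toy stand-in for Redshift's internal hash, not its actual algorithm. It counts how many ORDERS rows sit on a different slice from the dimension row they join to.

```python
NUM_SLICES = 4  # hypothetical number of slices in the cluster


def slice_for(key: int) -> int:
    """Toy stand-in for hash distribution: equal keys always map to the same slice."""
    return key % NUM_SLICES


def rows_needing_movement(orders, dist_column, join_column):
    """Count fact rows stored on a different slice from their join partner.

    Assumes the dimension table is distributed on its own key, so the
    dimension row for a given join value lives on slice_for(join value).
    """
    return sum(
        1
        for row in orders
        if slice_for(row[dist_column]) != slice_for(row[join_column])
    )


# Hypothetical ORDERS rows with a foreign key to a CUSTOMER dimension.
orders = [
    {"order_id": i, "customer_id": (i * 7) % 50}
    for i in range(1000)
]

# ORDERS distributed on the dimension key: every row is already collocated.
colocated = rows_needing_movement(orders, "customer_id", "customer_id")

# ORDERS distributed on its own key: some rows must move at join time.
not_colocated = rows_needing_movement(orders, "order_id", "customer_id")

print("rows to move when distributed on customer_id:", colocated)  # 0
print("rows to move when distributed on order_id:", not_colocated)
```

Because only one column can serve as the distribution key, this saving applies to a single join, which is why the largest and most frequently joined dimension table is the one worth choosing.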