Databricks Certified Data Engineer Professional — Question 144
A data architect has heard about Delta Lake’s built-in versioning and time travel capabilities. For auditing purposes, they have a requirement to maintain a full record of all valid street addresses as they appear in the customers table.
The architect is interested in implementing a Type 1 table, overwriting existing records with new values and relying on Delta Lake time travel to support long-term auditing. A data engineer on the project feels that a Type 2 table will provide better performance and scalability.
Which piece of information is critical to this decision?
Answer options
- A. Data corruption can occur if a query fails in a partially completed state because Type 2 tables require setting multiple fields in a single update.
- B. Shallow clones can be combined with Type 1 tables to accelerate historic queries for long-term versioning.
- C. Delta Lake time travel cannot be used to query previous versions of these tables because Type 1 changes modify data files in place.
- D. Delta Lake time travel does not scale well in cost or latency to provide a long-term versioning solution.
Correct answer: D
Explanation
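The correct answer is D. Delta Lake time travel reads earlier versions from the transaction log and from data files that are no longer part of the current table version, so every version the audit might need has to stay in storage. Keeping those files indefinitely means extending the retention settings far beyond their defaults and not vacuuming the old files, which drives up storage cost, and queries against old versions can become slower as the history grows. A Type 2 table instead stores each address as its own row with validity information, so the full history is available through ordinary queries against the current version of the table. Option A is wrong because Delta Lake transactions are atomic, so a failed update does not leave the table in a partially written state. Option B is wrong because a shallow clone only references the source table's data files; it does not accelerate historic queries or preserve versions long term. Option C is wrong because Delta Lake does not modify data files in place: an overwrite writes new files and marks the old ones as removed in the transaction log, which is what makes time travel possible at all.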
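To make the Type 2 idea concrete, here is a minimal sketch in plain Python rather than Spark or Delta Lake. It assumes a hypothetical row layout with `effective_date` and `end_date` fields; the names `AddressRow`, `apply_type2_change`, and `address_as_of` are illustrative and are not part of any Databricks API. Each address change closes the current row and appends a new one, so historic lookups are ordinary filters over the current data.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class AddressRow:
    # Hypothetical Type 2 layout: one row per address per validity period.
    customer_id: int
    address: str
    effective_date: date
    end_date: Optional[date] = None  # None means the row is still current


def apply_type2_change(table: List[AddressRow], customer_id: int,
                       new_address: str, change_date: date) -> None:
    """Close the customer's current row (if any) and append the new address."""
    for row in table:
        if row.customer_id == customer_id and row.end_date is None:
            if row.address == new_address:
                return  # address unchanged, keep the current row open
            row.end_date = change_date
    table.append(AddressRow(customer_id, new_address, change_date))


def address_as_of(table: List[AddressRow], customer_id: int,
                  as_of: date) -> Optional[str]:
    """Return the address that was valid on `as_of`, or None if there is none."""
    for row in table:
        if (row.customer_id == customer_id
                and row.effective_date <= as_of
                and (row.end_date is None or as_of < row.end_date)):
            return row.address
    return None


customers: List[AddressRow] = []
apply_type2_change(customers, 1, "12 Oak St", date(2021, 1, 1))
apply_type2_change(customers, 1, "98 Elm Ave", date(2023, 6, 15))

print(address_as_of(customers, 1, date(2022, 3, 1)))  # 12 Oak St
print(address_as_of(customers, 1, date(2024, 1, 1)))  # 98 Elm Ave
```

Both addresses remain in the table after the change, so the audit trail does not depend on how long old table versions are retained. In a real Delta table the same close-and-append step would typically be expressed as a merge, which is outside the scope of this sketch.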