Databricks Certified Data Engineer Professional — Question 11
Which statement describes Delta Lake Auto Compaction?
Answer options
- A. An asynchronous job runs after the write completes to detect if files could be further compacted; if yes, an OPTIMIZE job is executed toward a default of 1 GB.
- B. Before a Jobs cluster terminates, OPTIMIZE is executed on all tables modified during the most recent job.
- C. Optimized writes use logical partitions instead of directory partitions; because partition boundaries are only represented in metadata, fewer small files are written.
- D. Data is queued in a messaging bus instead of committing data directly to memory; all data is committed from the messaging bus in one batch once the job is complete.
- E. An asynchronous job runs after the write completes to detect if files could be further compacted; if yes, an OPTIMIZE job is executed toward a default of 128 MB.
Correct answer: E
Explanation
E is correct: after a write completes, Auto Compaction checks whether the files can be compacted further and, if so, runs an OPTIMIZE job that targets a default file size of 128 MB. Option A describes the same mechanism but gives 1 GB, which is the default target of a standard OPTIMIZE run rather than of Auto Compaction. Option B describes cluster-termination behavior and Option D describes a messaging-bus commit model, neither of which is part of Auto Compaction. Option C is about optimized writes, a separate feature.
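To make the 128 MB versus 1 GB distinction concrete, here is a minimal Python sketch. It only builds a SQL string and does some arithmetic, so it runs without Spark. It assumes the table property `delta.autoOptimize.autoCompact` is how Auto Compaction is enabled per table (check this against your runtime's documentation). The table name `sales.events` and both helper functions are hypothetical, and the file counts are idealized estimates, not measured results.

```python
# Sketch only: builds the statement text; executing it needs a Spark session with Delta Lake.
AUTO_COMPACT_PROPERTY = "delta.autoOptimize.autoCompact"  # assumed table property name
AUTO_COMPACT_TARGET_BYTES = 128 * 1024 * 1024             # 128 MB default from option E
STANDARD_OPTIMIZE_TARGET_BYTES = 1024 * 1024 * 1024       # 1 GB figure from option A


def enable_auto_compaction_sql(table_name: str) -> str:
    """Return an ALTER TABLE statement that turns on auto compaction for one table."""
    return (
        f"ALTER TABLE {table_name} "
        f"SET TBLPROPERTIES ('{AUTO_COMPACT_PROPERTY}' = 'true')"
    )


def approx_file_count(total_bytes: int, target_bytes: int) -> int:
    """Ceiling division: rough file count if every file reached the target size."""
    return -(-total_bytes // target_bytes)


statement = enable_auto_compaction_sql("sales.events")  # hypothetical table name
ten_gb = 10 * 1024 ** 3
files_at_128_mb = approx_file_count(ten_gb, AUTO_COMPACT_TARGET_BYTES)
files_at_1_gb = approx_file_count(ten_gb, STANDARD_OPTIMIZE_TARGET_BYTES)

print(statement)
print(files_at_128_mb, files_at_1_gb)
```

On a cluster, the generated statement would typically be passed to `spark.sql(statement)`. The arithmetic shows why the two defaults are easy to tell apart: the same 10 GB of data ideally lands in about 80 files at a 128 MB target, against about 10 files at 1 GB.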