Databricks Certified Data Engineer Associate — Question 29
A data engineer has realized that the data files associated with a Delta table are incredibly small. They want to compact the small files to form larger files to improve performance.
Which of the following keywords can be used to compact the small files?
Answer options
- A. REDUCE
- B. OPTIMIZE
- C. COMPACTION
- D. REPARTITION
- E. VACUUM
Correct answer: B
Explanation
The correct answer is OPTIMIZE. It rewrites the many small data files of a Delta table into fewer, larger files, which reduces per-file overhead when the table is read and so improves query performance. The other options do not compact files: REDUCE and COMPACTION are not Delta Lake commands; REPARTITION redistributes the data of a DataFrame or query across partitions but does not rewrite the files already stored for a table; and VACUUM deletes data files that the table no longer references and that are older than the retention threshold, so it cleans up storage rather than merging files.
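As a rough illustration of how OPTIMIZE might be issued, the sketch below assembles the SQL statement as a string. The helper `build_optimize_statement` and the table and column names (`sales.events`, `event_date`, `user_id`) are hypothetical and exist only for this example; the optional WHERE and ZORDER BY clauses are part of the documented OPTIMIZE syntax but are not needed to answer the question.

```python
from typing import Optional, Sequence


def build_optimize_statement(
    table: str,
    where: Optional[str] = None,
    zorder_by: Optional[Sequence[str]] = None,
) -> str:
    """Assemble a Delta Lake OPTIMIZE statement (hypothetical helper)."""
    parts = [f"OPTIMIZE {table}"]
    if where:
        # OPTIMIZE only accepts predicates on partition columns here.
        parts.append(f"WHERE {where}")
    if zorder_by:
        # Optionally co-locate related rows while the files are rewritten.
        parts.append(f"ZORDER BY ({', '.join(zorder_by)})")
    return " ".join(parts)


# Plain compaction of every small file in the (hypothetical) table.
stmt = build_optimize_statement("sales.events")
print(stmt)

# Assumption: in a Databricks notebook, where a SparkSession named `spark`
# already exists, the statement could then be run with spark.sql(stmt).
```

In practice the statement can also be typed directly into a SQL cell; building it in Python is only a convenience when the table name comes from a variable.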