Databricks Certified Data Engineer Professional — Question 38

Which of the following is true of Delta Lake and the Lakehouse?

Answer options

Correct answer: B

Explanation

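Option B is correct: by default, Delta Lake collects file-level statistics (such as minimum and maximum values and null counts) on the first 32 columns of a table, and uses them for data skipping when a query filters on those columns. Option A is incorrect because Parquet is a columnar format that encodes and compresses values column by column, so compression does not depend only on repeated characters. Option C is wrong because a view is a stored query that is evaluated against its source tables when it is read; it does not maintain a cache of them. Option D is misleading because primary and foreign key constraints on Databricks are informational and not enforced, so they do not prevent duplicate values from being inserted. Option E is false because Z-ordering can also be applied to non-numeric columns, such as strings and dates.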
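
The idea behind data skipping can be illustrated with a small, self-contained Python sketch. This is a toy model, not Delta Lake's implementation: the names FileStats, collect_stats, and files_to_scan are hypothetical, and only per-file minimum and maximum values are tracked. It assumes every file has at least one row and no null values.

```python
from dataclasses import dataclass

# Mirrors the default of 32 indexed columns described above
# (in Delta Lake the limit is set by the table property
# delta.dataSkippingNumIndexedCols). The constant name is hypothetical.
NUM_INDEXED_COLS = 32


@dataclass
class FileStats:
    """Hypothetical per-file statistics record (min/max only)."""
    path: str
    min_values: dict
    max_values: dict


def collect_stats(path, rows, column_order, num_indexed=NUM_INDEXED_COLS):
    """Compute min/max for the first `num_indexed` columns of one file."""
    indexed = column_order[:num_indexed]
    mins = {c: min(r[c] for r in rows) for c in indexed}
    maxs = {c: max(r[c] for r in rows) for c in indexed}
    return FileStats(path, mins, maxs)


def files_to_scan(files, column, value):
    """Return paths of files that might contain rows where column == value."""
    selected = []
    for f in files:
        if column not in f.min_values:
            # No statistics for this column, so the file cannot be skipped.
            selected.append(f.path)
        elif f.min_values[column] <= value <= f.max_values[column]:
            # The value falls inside the file's range, so it must be read.
            selected.append(f.path)
    return selected


# Illustrative data: two "files" with a numeric and a string column.
columns = ["id", "country"]
part0 = [{"id": 1, "country": "AR"}, {"id": 50, "country": "BR"}]
part1 = [{"id": 51, "country": "US"}, {"id": 99, "country": "VN"}]
files = [
    collect_stats("part-0", part0, columns),
    collect_stats("part-1", part1, columns),
]

print(files_to_scan(files, "id", 75))          # ['part-1']
print(files_to_scan(files, "country", "BR"))   # ['part-0']
```

In this sketch, a filter on a column that falls outside the indexed prefix (for example, when num_indexed=1 and the filter is on country) forces every file to be scanned, which is why column order can matter for wide tables. The string column also shows that min/max statistics are not limited to numeric types.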