Databricks Certified Data Engineer Professional — Question 115
A Databricks SQL dashboard has been configured to monitor the total number of records present in a collection of Delta Lake tables using the following query pattern:
SELECT COUNT(*) FROM table
Which of the following describes how results are generated each time the dashboard is updated?
Answer options
- A. The total count of rows is calculated by scanning all data files
- B. The total count of rows will be returned from cached results unless REFRESH is run
- C. The total count of records is calculated from the Delta transaction logs
- D. The total count of records is calculated from the Parquet file metadata
Correct answer: C
Explanation
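The correct answer is C. Each add action in the Delta transaction log can carry per-file statistics, including the number of records in that file (numRecords). A bare COUNT(*) with no filter can therefore be answered by summing those statistics over the files that make up the current table version, without opening the data files. Option A is incorrect because a full scan of the data files is unnecessary when those statistics are available. Option B is incorrect because the count is derived from the current table version each time the dashboard is updated; it is not held in a cache until REFRESH is run manually. Option D is incorrect because, although Parquet footers do record row counts, reading them would still mean opening every data file, and the footers alone do not say which files belong to the current version of the table. That information lives in the Delta log.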
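
To make the mechanism behind option C concrete, here is a minimal sketch in Python of how a row count could be derived from log entries alone. It is an illustration, not the Databricks implementation: the function name count_rows_from_log and the sample file paths are hypothetical, and the sketch assumes each commit line is a JSON action whose add entries carry a stats string containing numRecords, as described above. It ignores checkpoints and other action types.

```python
import json


def count_rows_from_log(commit_lines):
    """Replay add/remove actions and sum numRecords over the live files.

    Hypothetical helper for illustration only. Returns None if any live
    file lacks statistics, since the count could not then be answered
    from the log alone.
    """
    live = {}  # file path -> numRecords (or None when stats are absent)
    for line in commit_lines:
        action = json.loads(line)
        if "add" in action:
            add = action["add"]
            # In the log, stats is itself a JSON-encoded string.
            stats = json.loads(add["stats"]) if add.get("stats") else {}
            live[add["path"]] = stats.get("numRecords")
        elif "remove" in action:
            # A removed file no longer belongs to the current version.
            live.pop(action["remove"]["path"], None)
    if any(n is None for n in live.values()):
        return None
    return sum(live.values())


# Made-up sample log: two files added, one removed, one more added.
sample_log = [
    json.dumps({"add": {"path": "part-0000.parquet",
                        "stats": json.dumps({"numRecords": 100})}}),
    json.dumps({"add": {"path": "part-0001.parquet",
                        "stats": json.dumps({"numRecords": 250})}}),
    json.dumps({"remove": {"path": "part-0000.parquet"}}),
    json.dumps({"add": {"path": "part-0002.parquet",
                        "stats": json.dumps({"numRecords": 40})}}),
]

total = count_rows_from_log(sample_log)
print(total)  # 290: only part-0001 and part-0002 are still live
```

Note that no Parquet file is opened at any point: the count comes entirely from the log entries, which is why options A and D do not describe what happens.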