Databricks Certified Data Engineer Associate — Question 80
A data engineer has three tables in a Delta Live Tables (DLT) pipeline. They have configured the pipeline to drop invalid records at each table. They notice that some data is being dropped due to quality concerns at some point in the DLT pipeline. They would like to determine at which table in their pipeline the data is being dropped.
Which approach can the data engineer take to identify the table that is dropping the records?
Answer options
- A. They can set up separate expectations for each table when developing their DLT pipeline.
- B. They can navigate to the DLT pipeline page, click on the “Error” button, and review the present errors.
- C. They can set up DLT to notify them via email when records are dropped.
- D. They can navigate to the DLT pipeline page, click on each table, and view the data quality statistics.
Correct answer: D
Explanation
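The correct answer is D. On the DLT pipeline page, selecting a table in the pipeline graph shows that table's data quality statistics, including how many records passed and failed each expectation. By comparing these statistics across the three tables, the data engineer can see exactly where records are being dropped. Option A does not help: expectations are already configured on every table, and defining them controls what gets dropped, not where the drops occurred. Option B would surface pipeline errors, but records dropped by an expectation are not errors and the pipeline keeps running. Option C would at most tell the engineer that drops happened, without identifying the table responsible.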
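The per-table statistics shown in the UI are also recorded in the pipeline's event log, so the same comparison can be done programmatically. The following is a minimal sketch, assuming the event log's `details` JSON exposes expectation results under `flow_progress.data_quality.expectations` with `dataset`, `passed_records`, and `failed_records` fields. The sample rows, table names, and counts are invented for illustration, and `failed_records_by_table` is a hypothetical helper, not part of any Databricks API.

```python
import json
from collections import defaultdict

# Hypothetical sample of `details` JSON strings from a DLT event log.
# Table names, expectation names, and counts are made up for illustration.
SAMPLE_EVENT_DETAILS = [
    json.dumps({"flow_progress": {"data_quality": {"expectations": [
        {"name": "valid_id", "dataset": "bronze_orders",
         "passed_records": 100, "failed_records": 0}]}}}),
    json.dumps({"flow_progress": {"data_quality": {"expectations": [
        {"name": "valid_amount", "dataset": "silver_orders",
         "passed_records": 88, "failed_records": 12}]}}}),
    json.dumps({"flow_progress": {"data_quality": {"expectations": [
        {"name": "valid_region", "dataset": "gold_orders",
         "passed_records": 88, "failed_records": 0}]}}}),
    # Events without data quality metrics are skipped.
    json.dumps({"flow_progress": {"status": "COMPLETED"}}),
]


def failed_records_by_table(details_rows):
    """Sum failed expectation records per dataset (table)."""
    totals = defaultdict(int)
    for raw in details_rows:
        details = json.loads(raw)
        data_quality = details.get("flow_progress", {}).get("data_quality") or {}
        for expectation in data_quality.get("expectations") or []:
            totals[expectation["dataset"]] += expectation["failed_records"]
    return dict(totals)


if __name__ == "__main__":
    totals = failed_records_by_table(SAMPLE_EVENT_DETAILS)
    # List tables with the most failed records first.
    for table, failed in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{table}: {failed} failed records")
```

In this made-up sample, the table with a non-zero failed count is the one dropping records, which is the same conclusion the engineer would reach by clicking through each table's data quality statistics in the UI.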