Databricks Certified Data Engineer Professional — Question 173
A team of data engineers is adding tables to a DLT pipeline, and many of these tables repeat the same expectations for the same data quality checks. One member of the team suggests reusing these data quality rules across all tables defined for this pipeline.
What approach would allow them to do this?
Answer options
- A. Add data quality constraints to tables in this pipeline using an external job with access to pipeline configuration files.
- B. Use global Python variables to make expectations visible across DLT notebooks included in the same pipeline.
- C. Maintain data quality rules in a separate Databricks notebook that each DLT notebook or file can import as a library.
- D. Maintain data quality rules in a Delta table outside of this pipeline's target schema, providing the schema name as a pipeline parameter.
Correct answer: D
Explanation
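Option D is correct because it separates the rules from the pipeline code. Each rule is stored as a row in a Delta table (for example a name, a SQL constraint, and a tag), and that table lives outside the pipeline's target schema, so it is not one of the datasets the pipeline manages. The schema name is passed in as a pipeline parameter; the pipeline code reads the table, builds a dictionary of rule names to constraints, and applies it to any table with the `expect_all` family of decorators. Changing a rule then means updating one row rather than editing every table definition. Option A does not work because expectations are declared as part of the dataset definitions in the pipeline source, not attached afterwards by an external job. Option B is not a dependable mechanism: expectations are bound to individual datasets through decorators or constraint clauses, and global variables give no supported way to share them across the notebooks of a pipeline. Option C sounds close, but a Databricks notebook cannot be imported as a library by pipeline source code; shared Python code would have to live in a Python file or package instead, which is not what this option describes.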
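The pattern behind option D can be sketched in Python as follows. This is a minimal illustration, not the exam's reference solution: the helper `rules_by_tag`, the rule columns (`name`, `constraint`, `tag`), the table name `dq_rules`, and the parameter key `rules_schema` are all assumptions made for the example. The DLT-specific part is shown in comments because `dlt` and `spark` are only available inside a running pipeline; the sample rows stand in for what reading the rules table would return.

```python
def rules_by_tag(rows, tag):
    """Build the {rule_name: sql_constraint} dict that expect_all-style
    decorators accept, keeping only the rules labelled with `tag`."""
    return {row["name"]: row["constraint"] for row in rows if row["tag"] == tag}


# Stand-in for the rows of the rules table (hypothetical contents).
sample_rows = [
    {"name": "valid_id", "constraint": "id IS NOT NULL", "tag": "validity"},
    {"name": "positive_amount", "constraint": "amount > 0", "tag": "validity"},
    {"name": "recent_event", "constraint": "event_date >= '2020-01-01'", "tag": "freshness"},
]

validity_rules = rules_by_tag(sample_rows, "validity")
print(validity_rules)

# Inside a DLT pipeline, the same helper could be used roughly like this
# (assumed names: parameter key "rules_schema", table "dq_rules"):
#
#   import dlt
#
#   schema = spark.conf.get("rules_schema")  # set in the pipeline configuration
#   rows = spark.read.table(f"{schema}.dq_rules").collect()
#
#   @dlt.table
#   @dlt.expect_all_or_drop(rules_by_tag(rows, "validity"))
#   def orders_clean():
#       return dlt.read("orders_raw")
```

Tagging the rules lets different tables pick different subsets from the same rules table, and swapping `expect_all_or_drop` for `expect_all` or `expect_all_or_fail` changes what happens to violating records without touching the rules themselves.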