Databricks Certified Data Engineer Professional — Question 58
A Databricks job has been configured with 3 tasks, each of which is a Databricks notebook. Task A does not depend on other tasks. Tasks B and C run in parallel, with each having a serial dependency on Task A.
If task A fails during a scheduled run, which statement describes the results of this run?
Answer options
- A. Because all tasks are managed as a dependency graph, no changes will be committed to the Lakehouse until all tasks have successfully been completed.
- B. Tasks B and C will attempt to run as configured; any changes made in task A will be rolled back due to task failure.
- C. Unless all tasks complete successfully, no changes will be committed to the Lakehouse; because task A failed, all commits will be rolled back automatically.
- D. Tasks B and C will be skipped; some logic expressed in task A may have been committed before task failure.
- E. Tasks B and C will be skipped; task A will not commit any changes because of stage failure.
Correct answer: D
Explanation
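The correct answer is D, for two reasons. First, Tasks B and C each depend on Task A, and by default a dependent task runs only when all of its upstream tasks succeed, so when Task A fails, B and C are skipped rather than attempted. This rules out option B. Second, a Databricks job does not wrap its tasks, or even a single notebook, in one transaction. Each Delta Lake write is committed atomically on its own, so any write that Task A completed before the point of failure stays committed and is not rolled back. This rules out options A, C, and E, which all assume a job-level or task-level rollback that does not exist.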
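The behavior described above can be illustrated with a small toy model. This is a sketch in plain Python, not the Databricks Jobs API: `run_job`, the `committed` list (standing in for the Lakehouse), and the three task functions are all hypothetical names introduced for illustration. It assumes the default rule that a task runs only if every upstream task succeeded.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def run_job(tasks, depends_on):
    """Toy scheduler: run tasks in dependency order and skip any task
    whose upstream tasks did not all succeed."""
    status = {}
    for name in TopologicalSorter(depends_on).static_order():
        if any(status[up] != "SUCCESS" for up in depends_on[name]):
            status[name] = "SKIPPED"
            continue
        try:
            tasks[name]()
            status[name] = "SUCCESS"
        except Exception:
            status[name] = "FAILED"
    return status

# Stand-in for the Lakehouse: each append models one independently
# committed write. Nothing here ever removes an entry, which mirrors
# the absence of a job-level rollback.
committed = []

def task_a():
    committed.append("A: first write")   # committed before the failure
    raise RuntimeError("task A failed partway through")
    committed.append("A: second write")  # never reached

def task_b():
    committed.append("B: write")

def task_c():
    committed.append("C: write")

tasks = {"A": task_a, "B": task_b, "C": task_c}
depends_on = {"A": [], "B": ["A"], "C": ["A"]}  # B and C each depend on A

status = run_job(tasks, depends_on)
print(status)
print(committed)
```

In this model, Task A ends as failed, B and C end as skipped, and the first write made by Task A remains in `committed`, which matches option D.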