Databricks Certified Data Engineer Associate — Question 18
A data engineer has a Job with multiple tasks that runs nightly. Each of the tasks runs slowly because the clusters take a long time to start.
Which of the following actions can the data engineer perform to improve the startup time for the clusters used for the Job?
Answer options
- A. They can use endpoints available in Databricks SQL
- B. They can use jobs clusters instead of all-purpose clusters
- C. They can configure the clusters to be single-node
- D. They can use clusters that are from a cluster pool
- E. They can configure the clusters to autoscale for larger data sizes
Correct answer: D
Explanation
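Using clusters from a cluster pool shortens startup time because a pool keeps a set of idle, ready-to-use instances, so a new cluster can take its nodes from the pool instead of waiting for the cloud provider to provision them. Option A does not help because Databricks SQL endpoints are meant for SQL workloads and do not change how the Job's clusters start. Option B does not help because a jobs cluster still has to acquire fresh instances on each run unless it draws them from a pool. Option C reduces the number of nodes, but that one node still has to be provisioned. Option E changes how a cluster scales under load, which might improve performance on larger data but not the initialization speed of the clusters.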
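As an illustration of option D, the sketch below builds a Jobs API style job definition whose cluster takes its nodes from a pool through the `instance_pool_id` field. It only constructs and prints the payload and makes no API call. The pool ID, job name, task keys, notebook paths, runtime version, and worker count are hypothetical placeholders, and sharing one job cluster across both tasks is an assumption made for brevity rather than something the question states.

```python
import json


def build_nightly_job(pool_id: str) -> dict:
    """Return a job definition whose cluster takes its nodes from a pool.

    All names and values here are illustrative placeholders.
    """
    pooled_cluster = {
        # Placeholder runtime version; use one available in your workspace.
        "spark_version": "13.3.x-scala2.12",
        # Nodes come from the pool's idle instances, so no node_type_id
        # is set here: the pool determines the instance type.
        "instance_pool_id": pool_id,
        "num_workers": 2,
    }
    return {
        "name": "nightly-job",  # hypothetical job name
        "job_clusters": [
            {"job_cluster_key": "pooled_cluster", "new_cluster": pooled_cluster}
        ],
        "tasks": [
            {
                "task_key": "ingest",
                "job_cluster_key": "pooled_cluster",
                "notebook_task": {"notebook_path": "/Jobs/ingest"},
            },
            {
                "task_key": "transform",
                "depends_on": [{"task_key": "ingest"}],
                "job_cluster_key": "pooled_cluster",
                "notebook_task": {"notebook_path": "/Jobs/transform"},
            },
        ],
    }


if __name__ == "__main__":
    # Replace the placeholder with the ID of an existing pool.
    print(json.dumps(build_nightly_job("<your-pool-id>"), indent=2))
```

The printed JSON could be submitted when creating the job. The startup benefit depends on the pool actually holding idle instances when the nightly run begins, which is controlled by the pool's own settings.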