A data scientist wants to parallelize the training of trees in a gradient boosted tree to…

Question

A data scientist wants to parallelize the training of trees in a gradient boosted tree to speed up the training process. A colleague suggests that parallelizing a boosted tree algorithm can be difficult.
Which of the following describes why?

Accepted Answer

Correct answer: D. D. Gradient boosting is an iterative algorithm that requires information from the previous iteration to perform the next step. — The correct answer is D because gradient boosting relies on the results from the previous iteration to inform the next one, making it inherently sequential. The other options either misrepresent how gradient boosting operates or incorrectly describe the requirements for parallelization.

Databricks Certified Machine Learning Associate — Question 22

Answer options

Correct answer: D

Explanation