Databricks Certified Machine Learning Associate — Question 22

A data scientist wants to parallelize the training of trees in a gradient boosted tree to speed up the training process. A colleague suggests that parallelizing a boosted tree algorithm can be difficult.
Which of the following describes why?

Answer options

Correct answer: D

Explanation

The correct answer is D because gradient boosting relies on the results from the previous iteration to inform the next one, making it inherently sequential. The other options either misrepresent how gradient boosting operates or incorrectly describe the requirements for parallelization.