Databricks Certified Machine Learning Associate — Question 31

The implementation of linear regression in Spark ML first attempts to solve the linear regression problem using matrix decomposition, but this method does not scale well to large datasets with a large number of variables.
Which of the following approaches does Spark ML use to distribute the training of a linear regression model for large data?

Answer options

Correct answer: C

Explanation

The correct answer is C: Spark ML uses iterative optimization to distribute the training of linear regression models on large datasets. In each iteration the workers compute gradient contributions on their own data partitions in parallel, and only the aggregated result is needed to update the coefficients. The other options are incorrect for the following reasons: logistic regression is a different algorithm, used for classification; option B is false because Spark ML can distribute training; option D describes a method rather than an approach to distribution; and option E refers to a matrix decomposition technique, not a distribution method.
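
To make the idea concrete, here is a minimal sketch in plain Python (no Spark required) of why iterative optimization distributes well. It is not Spark ML's implementation: plain gradient descent stands in for Spark's actual optimizer (L-BFGS), and the names partial_gradient and fit, the toy data, and the learning rate are all assumptions made for illustration. The point is only that each partition produces a small partial result that can be computed independently and then summed.

```python
# Sketch: data-parallel gradient descent for the model y = w*x + b.
# Assumption: plain gradient descent is used here for brevity; it is a
# stand-in for the L-BFGS optimizer, not a copy of Spark ML's code.

def partial_gradient(partition, w, b):
    """Sum the squared-error gradients over one partition (the 'map' step)."""
    grad_w = grad_b = 0.0
    for x, y in partition:
        error = (w * x + b) - y
        grad_w += error * x
        grad_b += error
    return grad_w, grad_b, len(partition)


def fit(partitions, learning_rate=0.05, iterations=2000):
    """Repeat: compute partial gradients per partition, sum them, update."""
    w = b = 0.0
    for _ in range(iterations):
        # In a cluster, each partition could be handled by a different worker.
        partials = [partial_gradient(p, w, b) for p in partitions]
        # The 'reduce' step: only three numbers per partition are combined.
        total_w = sum(p[0] for p in partials)
        total_b = sum(p[1] for p in partials)
        n = sum(p[2] for p in partials)
        w -= learning_rate * total_w / n
        b -= learning_rate * total_b / n
    return w, b


# Hypothetical toy data generated from y = 2x + 1, split into two partitions.
partitions = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0)],
]
w, b = fit(partitions)
print(w, b)  # approaches w = 2, b = 1
```

The full dataset never has to sit on one machine: each step exchanges only the current coefficients and the per-partition sums. A matrix decomposition, by contrast, operates on a matrix whose size grows with the number of variables, which is the scaling limit the question describes.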