Databricks Certified Machine Learning Associate — Question 12

Which of the following is a benefit of using vectorized pandas UDFs instead of standard PySpark UDFs?

Answer options

Correct answer: B

Explanation

The correct answer is B because vectorized pandas UDFs are designed to process data in batches, which significantly improves performance compared to traditional UDFs that handle one row at a time. The other options, while potentially true about vectorized pandas UDFs, do not specifically highlight the key performance benefit of batch processing.