Databricks Certified Associate Developer for Apache Spark — Question 33
Which of the following code blocks applies the function assessPerformance() to each row of DataFrame storesDF?
Answer options
- A. [assessPerformance(row) for row in storesDF.take(3)]
- B. [assessPerformance() for row in storesDF]
- C. storesDF.collect().apply(lambda: assessPerformance)
- D. [assessPerformance(row) for row in storesDF.collect()]
- E. [assessPerformance(row) for row in storesDF]
Correct answer: D
Explanation
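The correct answer is D. storesDF.collect() returns every row of the DataFrame to the driver as a Python list of Row objects, and the list comprehension then calls assessPerformance(row) once per row. Option A is incorrect because take(3) returns only the first three rows, so the function is not applied to every row. Option B calls assessPerformance() without passing the row and also iterates over the DataFrame itself rather than over collected rows. Option C fails because collect() returns a plain Python list, which has no apply method, and its lambda takes no argument and never calls assessPerformance. Option E passes the row correctly but iterates over storesDF directly; a DataFrame is a distributed dataset, not a local sequence of Row objects, so the rows must first be retrieved with collect().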
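The differences between the options can be sketched in plain Python, without a Spark session. This is only an illustration under stated assumptions: the namedtuple Row stands in for pyspark.sql.Row, the list collected_rows stands in for what storesDF.collect() would return, a slice of that list models take(3), and the column names and the body of assessPerformance are hypothetical.

```python
from collections import namedtuple

# Hypothetical stand-in for pyspark.sql.Row; the column names are illustrative.
Row = namedtuple("Row", ["storeId", "sqft"])

# Stand-in for the Python list that storesDF.collect() would return.
collected_rows = [Row(1, 12000), Row(2, 8000), Row(3, 15000), Row(4, 5000)]


def assessPerformance(row):
    """Hypothetical rule: label a store by its floor area."""
    return "large" if row.sqft >= 10000 else "small"


# Option D pattern: one call per collected row.
all_results = [assessPerformance(row) for row in collected_rows]

# Option A pattern: take(3) is modelled here as the first three rows only.
first_three = [assessPerformance(row) for row in collected_rows[:3]]

# Option C pattern: a Python list has no apply method.
try:
    collected_rows.apply(lambda: assessPerformance)
    option_c_error = None
except AttributeError as exc:
    option_c_error = type(exc).__name__

# Option B pattern: calling the function without the row argument fails.
try:
    assessPerformance()
    option_b_error = None
except TypeError as exc:
    option_b_error = type(exc).__name__

print(len(all_results), len(first_three))
print(option_c_error, option_b_error)
```

With a real DataFrame, collected_rows would be replaced by storesDF.collect(). Because collect() brings every row to the driver, this pattern suits DataFrames small enough to fit in driver memory.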