Databricks Certified Associate Developer for Apache Spark — Question 35

The code block shown below contains an error. The code block is intended to create a Python UDF assessPerformanceUDF() using the integer-returning Python function assessPerformance() and apply it to column customerSatisfaction in DataFrame storesDF. Identify the error.
Code block:
assessPerformanceUDF = udf(assessPerformance)
storesDF.withColumn("result", assessPerformanceUDF(col("customerSatisfaction")))

Answer options

Correct answer: D

Explanation

The correct answer is D because udf() uses StringType as its return type when none is specified. Since assessPerformance() returns an integer, the return type must be passed to udf() explicitly, for example udf(assessPerformance, IntegerType()). Options A, B, C, and E are incorrect because they do not address the missing return type.