Databricks Certified Associate Developer for Apache Spark — Question 202

The code block shown below should create a Python UDF assessPerformanceUDF using the Python function assessPerformance() and apply it to column customerSatisfaction in DataFrame storesDF. Choose the response that correctly fills in the numbered blanks within the code block to complete this task.

Code block:

assessPerformanceUDF = _1_(_2_)
storesDF.withColumn("result", _3_(_4_))

Answer options

Correct answer: B

Explanation

B is correct: it calls udf on the assessPerformance function to create assessPerformanceUDF, without passing an additional return type, and then applies assessPerformanceUDF to the customerSatisfaction column. Options A and D incorrectly add IntegerType(). Option C uses spark.register.udf, which is not the correct syntax for defining a UDF. Option E puts quotes around the column name, which is not the correct way to reference the column in this context.