Databricks Certified Associate Developer for Apache Spark — Question 209

A data engineer is working with the DataFrame num_df:
num_df = spark.range(5).toDF("num")

The engineer is using the Python UDF:
def cubefunc(val):
    return val ** 3

Which code fragment registers and uses this UDF as a Spark SQL function to work with the DataFrame num_df?

Answer options

Correct answer: C

Explanation

The correct answer is C. It registers the function as a Spark SQL UDF and declares DoubleType() as the return type, which the answer key treats as suitable for the cubed values. Option A declares IntegerType(), which may not accommodate all possible cube results. Option B never registers the UDF, so it cannot be called by name in SQL expressions. Option D contains a syntax error and does not register the UDF properly.