Databricks Certified Associate Developer for Apache Spark — Question 204
Which UDF implementation calculates the length of strings in a Spark DataFrame?
Answer options
- A. df.withColumn("length", spark.udf("len", StringType()))
- B. df.select(length(col("stringColumn")).alias("Length"))
- C. spark.udf.register("stringLength", lambda s: len(s))
- D. df.withColumn("length", udf(lambda s: len(s), StringType()))
Correct answer: D
Explanation
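Option D is the intended answer: it is the only option that wraps a Python function with 'udf' and uses the result to populate a new DataFrame column via 'withColumn'. As written it is shorthand, since in runnable code the UDF must be called on a column, for example 'udf(lambda s: len(s), IntegerType())(col("stringColumn"))', and 'IntegerType()' is a more natural return type for a length than 'StringType()'. Option A is wrong because 'spark.udf' is the session's UDF registration interface, not a function that builds a UDF from a name and a type. Option B computes the length with the built-in 'length' function from 'pyspark.sql.functions', so it is not a UDF implementation at all. Option C registers a UDF for use in SQL expressions but never applies it to a DataFrame.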
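For readers who want to try the idea behind option D, here is a minimal sketch rather than an official solution. The helper names 'string_length' and 'add_length_column' are hypothetical, the column name 'stringColumn' is borrowed from option B, and the null handling is an added assumption, since a bare 'len(s)' raises an error on a null value.

```python
from typing import Optional


def string_length(s: Optional[str]) -> Optional[int]:
    """Return the length of s, or None when the input is null."""
    return None if s is None else len(s)


def add_length_column(df, source_col: str = "stringColumn", target_col: str = "length"):
    """Return df with an extra integer column holding the length of source_col.

    Hypothetical helper for illustration; df is expected to be a PySpark DataFrame.
    """
    # Imported inside the function so string_length stays usable without Spark.
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import IntegerType

    # Wrap the plain Python function as a UDF with an integer return type.
    string_length_udf = udf(string_length, IntegerType())

    # The UDF has to be called on a column to yield a Column expression.
    return df.withColumn(target_col, string_length_udf(col(source_col)))
```

Assuming an active SparkSession named 'spark', a call such as 'add_length_column(spark.createDataFrame([("spark",), (None,)], ["stringColumn"]))' would exercise both the normal and the null path. In practice the built-in 'length' function from option B is usually preferable, because Python UDFs add serialization overhead between the JVM and the Python worker.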