Databricks Certified Associate Developer for Apache Spark — Question 29

The code block shown below contains an error. The code block is intended to return a new DataFrame with the mean of column sqft from DataFrame storesDF in column sqftMean. Identify the error.
Code block:
storesDF.agg(mean("sqft").alias("sqftMean"))

Answer options

Correct answer: A

Explanation

The correct answer is A: per the answer key, mean() should be passed a Column object, such as col("sqft"), rather than the column name as a string. Option B is incorrect because the column name does need to be quoted. Option C is incorrect because mean() is a standalone function, not a method of the Column object. Option D is incorrect because agg() is an appropriate way to aggregate a DataFrame. Option E is incorrect because there is more than one way to compute a mean, including aggregation functions.