Databricks Certified Associate Developer for Apache Spark — Question 66

The code block shown below should return a new DataFrame with the mean of column sqft from DataFrame storesDF in column sqftMean. Choose the response that correctly fills in the numbered blanks within the code block to complete this task.

Code block:

storesDF.__1__(__2__(__3__).alias("sqftMean"))

Answer options

Correct answer: A

Explanation

The correct answer is A. The agg method aggregates over the entire DataFrame, mean is the function that computes the average of a column, and col("sqft") references the sqft column. Option B is incorrect because withColumn adds or replaces a column on each row rather than aggregating rows into a single value. Option C uses average instead of mean; Spark's built-in functions include mean and avg, but not average. Options D and E reference the sqft column incorrectly.