Databricks Certified Associate Developer for Apache Spark — Question 117
The code block shown below contains an error. The code block is intended to return a collection of summary statistics for column sqft in DataFrame storesDF. Identify the error.
Code block:
storesDF.describe(col("sqft"))
Answer options
- A. The column sqft should be subsetted from DataFrame storesDF prior to computing summary statistics on it alone.
- B. The describe() operation does not accept a Column object as an argument outside of a sequence; the sequence Seq(col("sqft")) should be specified instead.
- C. The describe() operation doesn't compute summary statistics for a single column; the summary() operation should be used instead.
- D. The describe() operation doesn't compute summary statistics for numeric columns; the summary() operation should be used instead.
- E. The describe() operation does not accept a Column object as an argument; the column name string "sqft" should be specified instead.
Correct answer: E
Explanation
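The correct answer is E because describe() takes column names as strings, not Column objects. In the Scala API its signature is describe(cols: String*), so the call should be storesDF.describe("sqft"). Option A is incorrect because subsetting the column first is not required: describe() can be restricted to one column by name. Option B is incorrect because describe() has no overload that accepts a sequence of Column objects. Options C and D are incorrect because describe() does compute statistics (count, mean, stddev, min, and max) for a single column and for numeric columns; summary() is an alternative that also reports percentiles, not a requirement.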
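A minimal Scala sketch of the corrected call may make this concrete. The sample rows, the storeId column, and the local SparkSession settings are assumptions for illustration only; the question does not give the actual schema of storesDF.

```scala
import org.apache.spark.sql.SparkSession

object DescribeSqft {
  def main(args: Array[String]): Unit = {
    // Local session for illustration; the master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("describe-sqft")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for storesDF.
    val storesDF = Seq((1, 12000.0), (2, 8500.0), (3, 15250.0))
      .toDF("storeId", "sqft")

    // Corrected call: describe takes column names as strings (String*).
    storesDF.describe("sqft").show()

    // The erroneous form from the question does not compile in Scala,
    // because a Column is not a String:
    // storesDF.describe(col("sqft"))

    // summary() is an alternative that also reports percentiles;
    // here the column is selected first to limit the output to sqft.
    storesDF.select("sqft").summary().show()

    spark.stop()
  }
}
```

Both calls return a DataFrame with a summary column naming each statistic and one column per described field, so the result can be filtered or collected like any other DataFrame.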