Databricks Certified Associate Developer for Apache Spark — Question 100
Which of the following code blocks returns the number of rows in DataFrame storesDF for each unique value in column division?
Answer options
- A. storesDF.groupBy("division").agg(count())
- B. storesDF.agg(groupBy("division").count())
- C. storesDF.groupby.count("division")
- D. storesDF.groupBy().count("division")
- E. storesDF.groupBy("division").count()
Correct answer: E
Explanation
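Option E is correct: groupBy("division") groups the rows of storesDF by the values in column division, and calling count() on the grouped result returns a DataFrame with one row per unique division and a count column holding the number of rows in each group. Option A groups correctly, but the count function is called without the column argument it requires, so the expression fails before agg runs; agg(count("*")) would be a valid alternative. Option B is incorrect because groupBy is a DataFrame method, not a standalone function that can be nested inside agg. Option C is incorrect because it references groupby without calling it on a column and passes the column name to count, which takes no arguments on grouped data. Option D is incorrect because groupBy() with no columns does not group by division, and count again does not accept a column name here.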