Databricks Certified Associate Developer for Apache Spark — Question 115

Which of the following code blocks returns the number of rows in DataFrame storesDF for each distinct combination of values in column division and column storeCategory?

Answer options

Correct answer: C

Explanation

C is correct because it passes the column names 'division' and 'storeCategory' to groupBy as strings, which groups the DataFrame by each distinct combination of the two columns. Options A and E wrap the columns in an unnecessary Seq, and option D refers to the column as 'StoreCategory' with a capital 'S', which does not match the column name 'storeCategory' given in the question.
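
The grouping described above can be sketched in Scala as follows. This is a minimal illustration, not the text of any answer option: the sample rows, the object name, and the local SparkSession settings are hypothetical, and it assumes the per-group row count is obtained by calling count() on the grouped result, a step the explanation does not spell out.

```scala
import org.apache.spark.sql.SparkSession

object GroupByCountSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed setup, not part of the question).
    val spark = SparkSession.builder()
      .appName("groupBy-count-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; only the column names come from the question.
    val storesDF = Seq(
      ("North", "Grocery"),
      ("North", "Grocery"),
      ("North", "Outlet"),
      ("South", "Grocery")
    ).toDF("division", "storeCategory")

    // Column names passed directly as strings: one output row per distinct
    // (division, storeCategory) pair, with the row total in a `count` column.
    val counts = storesDF.groupBy("division", "storeCategory").count()
    counts.show()

    // If the names are already held in a Seq, it has to be expanded into
    // varargs; groupBy has no overload that accepts a bare Seq.
    val keys = Seq("division", "storeCategory")
    val sameCounts = storesDF.groupBy(keys.head, keys.tail: _*).count()
    sameCounts.show()

    spark.stop()
  }
}
```

Both calls should produce the same grouping. The second form is only worth using when the grouping columns are built at runtime; for a fixed pair of columns, passing the strings directly is shorter and easier to read.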