Databricks Certified Associate Developer for Apache Spark — Question 78

The code block shown below contains an error. The code block is intended to return the exact number of distinct values in column division in DataFrame storesDF. Identify the error.

Code block:

storesDF.agg(approx_count_distinct(col("division")).alias("divisionDistinct"))

Answer options

Correct answer: E

Explanation

The correct answer is E. The approx_count_distinct() function returns an estimate of the number of distinct values, accurate only to within a configurable relative error, so it cannot guarantee the exact count the question asks for. Replacing it with countDistinct() returns the exact number. Options A, B, C, and D either misdescribe what approx_count_distinct() does or state requirements for its use that do not exist.