Databricks Certified Associate Developer for Apache Spark — Question 39

Which of the following Spark properties is used to configure whether DataFrame partitions that do not meet a minimum size threshold are automatically coalesced into larger partitions during a shuffle?

Answer options

Correct answer: E

Explanation

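The correct answer is spark.sql.adaptive.coalescePartitions.enabled. When it is set to true and Adaptive Query Execution itself is enabled (spark.sql.adaptive.enabled), Spark merges small contiguous shuffle partitions into larger ones, guided by the target size in spark.sql.adaptive.advisoryPartitionSizeInBytes. The other properties do not control this behavior: for instance, spark.sql.shuffle.partitions sets the default number of partitions used when shuffling data for joins and aggregations, while spark.sql.autoBroadcastJoinThreshold sets the maximum size of a table that Spark will broadcast to all worker nodes when performing a join.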
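
As a rough illustration of how the property might be set in practice, the sketch below collects the relevant settings in a dictionary and applies them to a session through spark.conf.set. The names AQE_COALESCE_SETTINGS and apply_aqe_coalescing are hypothetical helpers introduced here, and the 64MB value is only an illustrative target size, not a recommendation.

```python
# Hypothetical helper: settings that together enable AQE partition coalescing.
AQE_COALESCE_SETTINGS = {
    # AQE must be on; the coalescing flag has no effect without it.
    "spark.sql.adaptive.enabled": "true",
    # The property the question asks about.
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    # Target partition size AQE aims for when merging small shuffle
    # partitions (illustrative value).
    "spark.sql.adaptive.advisoryPartitionSizeInBytes": "64MB",
}


def apply_aqe_coalescing(spark):
    """Apply each setting to a SparkSession-like object and return it."""
    for key, value in AQE_COALESCE_SETTINGS.items():
        spark.conf.set(key, value)
    return spark
```

With PySpark installed, this could be called as apply_aqe_coalescing(SparkSession.builder.getOrCreate()), after importing SparkSession from pyspark.sql. Because these are runtime SQL settings, they can be changed on an existing session rather than only at startup.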