Databricks Certified Associate Developer for Apache Spark — Question 39
Which of the following Spark properties is used to configure whether DataFrame partitions that do not meet a minimum size threshold are automatically coalesced into larger partitions during a shuffle?
Answer options
- A. spark.sql.shuffle.partitions
- B. spark.sql.autoBroadcastJoinThreshold
- C. spark.sql.adaptive.skewJoin.enabled
- D. spark.sql.inMemoryColumnarStorage.batchSize
- E. spark.sql.adaptive.coalescePartitions.enabled
Correct answer: E
Explanation
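The correct answer, spark.sql.adaptive.coalescePartitions.enabled, is the switch for automatically coalescing small shuffle partitions into larger ones. It is part of Adaptive Query Execution, so it only takes effect when spark.sql.adaptive.enabled is also true. The other options control different behavior: spark.sql.shuffle.partitions sets the number of partitions used when shuffling data for joins and aggregations; spark.sql.autoBroadcastJoinThreshold sets the maximum size of a table that Spark will broadcast in a join; spark.sql.adaptive.skewJoin.enabled controls the splitting of skewed (oversized) partitions in joins, which is the opposite problem; and spark.sql.inMemoryColumnarStorage.batchSize sets the batch size used for columnar caching.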
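As a rough illustration of how the property in option E might be turned on at runtime, the sketch below sets it together with the Adaptive Query Execution master switch. The helper name enable_partition_coalescing and the AQE_COALESCE_SETTINGS dictionary are hypothetical, introduced only for this example; the sketch assumes `spark` is an existing SparkSession (or any object exposing the same `conf.set(key, value)` interface).

```python
# Hypothetical names for illustration; only the two property keys come from Spark.
AQE_COALESCE_SETTINGS = {
    # Master switch for Adaptive Query Execution; coalescing does nothing without it.
    "spark.sql.adaptive.enabled": "true",
    # The property from option E: merge small post-shuffle partitions.
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
}


def enable_partition_coalescing(spark):
    """Apply the settings to an existing session and return a copy of what was set."""
    for key, value in AQE_COALESCE_SETTINGS.items():
        # Assumes `spark.conf` behaves like a SparkSession's runtime config.
        spark.conf.set(key, value)
    return dict(AQE_COALESCE_SETTINGS)
```

With a live session, calling enable_partition_coalescing(spark) and then reading the key back with spark.conf.get should confirm the setting. Note that spark.sql.shuffle.partitions still sets the initial partition count for a shuffle; coalescing only merges partitions that turn out to be small once the shuffle has run.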