Databricks Certified Associate Developer for Apache Spark — Question 52

The default value of spark.sql.shuffle.partitions is 200. Which of the following describes what that means?

Answer options

Correct answer: E

Explanation

E is correct: spark.sql.shuffle.partitions sets the number of partitions Spark SQL uses when it shuffles data for joins or aggregations. With the default of 200, a DataFrame is split into 200 partitions each time a shuffle occurs, and those partitions can then be processed in parallel, one task per partition. Options A, B, C, and D are incorrect because they tie the setting to executor memory usage, to reading only a limited number of partitions, or to the partitioning of existing DataFrames, rather than to the shuffling of data.
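The effect of the setting can be illustrated with a small pure-Python sketch. This is a toy model, not Spark's implementation: the function shuffle_by_key, the sample rows, and the modulo-based routing are all hypothetical stand-ins (Spark uses its own hash function). It only shows that the number of partitions after a shuffle is fixed by the setting, regardless of how the input was partitioned.

```python
DEFAULT_SHUFFLE_PARTITIONS = 200  # mirrors the default of spark.sql.shuffle.partitions


def shuffle_by_key(input_partitions, num_shuffle_partitions=DEFAULT_SHUFFLE_PARTITIONS):
    """Toy shuffle: route each (key, value) row to partition `key mod N`.

    Hypothetical helper for illustration only. Rows with the same key always
    land in the same output partition, and the output always has exactly
    `num_shuffle_partitions` partitions, even if some stay empty.
    """
    output = [[] for _ in range(num_shuffle_partitions)]
    for partition in input_partitions:
        for key, value in partition:
            output[key % num_shuffle_partitions].append((key, value))
    return output


# Hypothetical input: 1,000 rows with 10 distinct integer keys, held in 8 partitions.
rows = [(i % 10, i) for i in range(1000)]
input_partitions = [rows[i::8] for i in range(8)]

shuffled = shuffle_by_key(input_partitions)
non_empty = [p for p in shuffled if p]

print(len(input_partitions))  # partitions before the shuffle
print(len(shuffled))          # partitions after the shuffle
print(len(non_empty))         # partitions that actually hold data
```

In this sketch the shuffle produces 200 partitions even though only a handful hold data, which is why the value is often tuned for small or very large datasets. In a real Spark session, the current value can be read with spark.conf.get("spark.sql.shuffle.partitions") and changed with spark.conf.set("spark.sql.shuffle.partitions", ...).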