The default value of spark.sql.shuffle.partitions is 200. Which of the following describe…

Question

The default value of spark.sql.shuffle.partitions is 200. Which of the following describes what that means?

Accepted Answer

Correct answer: E. By default, DataFrames will be split into 200 unique partitions when data is being shuffled.

E is correct because spark.sql.shuffle.partitions sets how many partitions a DataFrame is split into whenever an operation such as a join or an aggregation shuffles data; those partitions can then be processed in parallel. Options A, B, C, and D are incorrect because they tie the setting to executor memory usage, to reading only a limited number of partitions, or to existing DataFrames rather than to the shuffling of data.
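To make the idea of "200 shuffle partitions" concrete, here is a minimal pure-Python sketch of hash-based shuffling. It is an illustration, not Spark code: the names SHUFFLE_PARTITIONS, shuffle_partition_for, and shuffle are hypothetical, the input rows are made up, and Python's built-in hash() stands in for Spark's own hash function. The only thing taken from the question is the partition count of 200.

```python
# Assumed stand-in for spark.sql.shuffle.partitions (the default is 200).
SHUFFLE_PARTITIONS = 200


def shuffle_partition_for(key, num_partitions=SHUFFLE_PARTITIONS):
    """Return the index of the shuffle partition a key is routed to.

    Simplification: Python's hash() replaces Spark's hash function;
    the principle (hash of the key modulo the partition count) is the same.
    """
    return hash(key) % num_partitions


def shuffle(rows, num_partitions=SHUFFLE_PARTITIONS):
    """Distribute (key, value) rows across num_partitions buckets."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in rows:
        partitions[shuffle_partition_for(key, num_partitions)].append((key, value))
    return partitions


# Hypothetical input: 1,000 rows spread over 10 distinct integer keys.
rows = [(i % 10, i) for i in range(1000)]
partitions = shuffle(rows)

print(len(partitions))                  # 200 shuffle partitions are created
print(sum(1 for p in partitions if p))  # 10 of them actually hold data
```

The sketch shows that the setting fixes the number of partitions a shuffle produces regardless of the data: with only 10 distinct keys, 190 of the 200 partitions stay empty. In a real Spark session the configured value can be read with spark.conf.get("spark.sql.shuffle.partitions"), and the partition count of a shuffled DataFrame can be checked with df.rdd.getNumPartitions(). With adaptive query execution enabled, Spark may coalesce small shuffle partitions, so the observed count can be lower than 200.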

Databricks Certified Associate Developer for Apache Spark — Question 52

Explanation