Which of the following describes the difference between DataFrame.repartition(n) and Data…

Question

Which of the following describes the difference between DataFrame.repartition(n) and DataFrame.coalesce(n)?

Accepted Answer

Correct answer: A.

A. DataFrame.repartition(n) will split a DataFrame into n number of new partitions with data distributed evenly. DataFrame.coalesce(n) will more quickly combine the existing partitions of a DataFrame but might result in an uneven distribution of data across the new partitions.

A is correct because DataFrame.repartition(n) performs a full shuffle to create n new partitions with roughly balanced data, while DataFrame.coalesce(n) merges existing partitions without a full shuffle, which is faster but can leave the resulting partitions uneven. Options B, C, D, and E describe the functionality or efficiency of these methods inaccurately.
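To make the trade-off concrete, here is a small pure-Python toy model of the two strategies. It is not Spark code and not Spark's implementation: `toy_repartition`, `toy_coalesce`, and the `skewed` sample data are hypothetical names invented for this sketch, and partitions are modelled as plain lists of rows. It assumes a round-robin redistribution for the shuffle case and a merge of neighbouring partitions for the coalesce case.

```python
from typing import List

Partition = List[int]


def toy_repartition(partitions: List[Partition], n: int) -> List[Partition]:
    """Toy full shuffle: deal every row round-robin into n new partitions."""
    rows = [row for part in partitions for row in part]
    new_parts: List[Partition] = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        new_parts[i % n].append(row)
    return new_parts


def toy_coalesce(partitions: List[Partition], n: int) -> List[Partition]:
    """Toy narrow merge: glue whole existing partitions together.

    No individual row is redistributed, so existing skew is carried over.
    """
    if n >= len(partitions):
        # Assumption mirrored from Spark: coalesce never increases the count.
        return [list(part) for part in partitions]
    new_parts: List[Partition] = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        # Map existing partition i onto one of the n target groups.
        new_parts[i * n // len(partitions)].extend(part)
    return new_parts


# Hypothetical skewed input: four partitions holding 90, 4, 3 and 3 rows.
skewed = [
    list(range(0, 90)),
    list(range(90, 94)),
    list(range(94, 97)),
    list(range(97, 100)),
]

repartition_sizes = [len(p) for p in toy_repartition(skewed, 2)]
coalesce_sizes = [len(p) for p in toy_coalesce(skewed, 2)]

print(repartition_sizes)  # [50, 50]
print(coalesce_sizes)     # [94, 6]
```

In this model the shuffle touches every row but ends up balanced, while the merge only moves whole partitions and inherits the original skew. That is the same trade-off option A describes for DataFrame.repartition(n) and DataFrame.coalesce(n).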

Databricks Certified Associate Developer for Apache Spark — Question 73
