Which of the following operations can be used to return a new DataFrame from DataFrame st…

Question

Which of the following operations can be used to return a new DataFrame from DataFrame storesDF without inducing a shuffle?

Accepted Answer

Correct answer: D. storesDF.coalesce(1). The coalesce operation reduces the number of partitions without causing a shuffle: it creates a narrow dependency in which each new partition claims whole existing partitions, so rows are not redistributed across the cluster. In contrast, options A, B, and C could induce a shuffle because they merge or repartition the data differently. Option E simply returns the number of partitions and does not create a new DataFrame.
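The difference between merging whole partitions and redistributing rows can be illustrated with a toy model. The sketch below is plain Python, not Spark: partitions are modelled as lists, and the helpers `coalesce`, `repartition`, and `split_inputs` are hypothetical names introduced only for this illustration. It assumes a round-robin row distribution to stand in for a shuffle and makes no claim about how Spark implements either operation internally.

```python
# Toy model only: partitions are plain Python lists, not Spark partitions.
# All function names here are hypothetical helpers for illustration.

def coalesce(partitions, n):
    """Merge whole input partitions into at most n output partitions."""
    n = min(n, len(partitions))  # this model never increases the partition count
    merged = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        # Each input partition goes, intact, to exactly one output partition.
        merged[i * n // len(partitions)].extend(part)
    return merged


def repartition(partitions, n):
    """Redistribute individual rows round-robin (stands in for a shuffle)."""
    out = [[] for _ in range(n)]
    idx = 0
    for part in partitions:
        for row in part:
            out[idx % n].append(row)
            idx += 1
    return out


def split_inputs(before, after):
    """Count input partitions whose rows landed in more than one output partition."""
    count = 0
    for part in before:
        targets = {j for j, out in enumerate(after) for row in part if row in out}
        if len(targets) > 1:
            count += 1
    return count


# Four input partitions with unique rows (assumed sample data).
stores = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]

print(coalesce(stores, 1))                            # one partition holding all 12 rows
print(split_inputs(stores, coalesce(stores, 2)))      # 0: no input partition was split
print(split_inputs(stores, repartition(stores, 2)))   # 4: every input partition was split
```

In this model, coalescing never splits an input partition, which is the property that lets it avoid moving individual rows between partitions. On a real cluster, one way to check the behaviour is to call `storesDF.coalesce(1).explain()` and look for the absence of an Exchange step in the physical plan.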

Databricks Certified Associate Developer for Apache Spark — Question 37
