Which of the following operations is least likely to result in a shuffle?

Question

Accepted Answer

Correct answer: B. B. DataFrame.fliter() — The correct answer is B, DataFrame.fliter(), as filtering data typically does not require reshuffling the entire dataset. In contrast, operations like join, orderBy, distinct, and intersect usually necessitate shuffling to align or combine data from different partitions.

Databricks Certified Associate Developer for Apache Spark — Question 144

Answer options

Correct answer: B

Explanation