Databricks Certified Data Engineer Professional — Question 49

Which statement describes the correct use of pyspark.sql.functions.broadcast?

Answer options

Correct answer: D

Explanation

The correct answer is D: pyspark.sql.functions.broadcast marks a DataFrame as small enough to fit in memory on every executor, which hints to the optimizer that it should use a broadcast join when that DataFrame is joined. Options A and B are wrong because the function takes a DataFrame, not a column. Options C and E misrepresent what the function does with respect to caching and storage location; it only marks the DataFrame as a candidate for broadcasting in a join.