Databricks Certified Associate Developer for Apache Spark — Question 165

Which of the following Spark properties is used to configure whether DataFrames found to be below a certain size threshold at runtime will be automatically broadcasted?

Answer options

Correct answer: B

Explanation

The correct answer is B. 'spark.sql.autoBroadcastJoinThreshold' sets the maximum size, in bytes, of a table that Spark will automatically broadcast to all worker nodes when performing a join. The default is 10 MB, and setting the property to -1 disables automatic broadcasting. Option A sets the timeout for a broadcast operation, which affects how long Spark waits for a broadcast but not whether one is chosen. Option C sets the number of shuffle partitions, option D sets the batch size for in-memory columnar storage, and option E enables local shuffle reading. None of these determines whether a DataFrame is broadcast based on its size.
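
As an illustration of how the property in option B is typically set, here is a minimal PySpark-style sketch. The helper names `threshold_bytes` and `set_auto_broadcast` are hypothetical and not part of Spark. The sketch assumes `spark` is an already-created SparkSession and passes the value as a string to `spark.conf.set`.

```python
# Property named in the explanation above; its value is a size in bytes.
AUTO_BROADCAST_KEY = "spark.sql.autoBroadcastJoinThreshold"


def threshold_bytes(megabytes):
    """Convert a size in MiB to the byte count the property expects."""
    return int(megabytes * 1024 * 1024)


def set_auto_broadcast(spark, megabytes=None):
    """Set the auto-broadcast threshold on an existing session.

    `spark` is assumed to be a SparkSession (hypothetical helper, not a Spark API).
    Passing megabytes=None writes -1, which disables automatic broadcasting.
    Returns the integer value that was written.
    """
    value = -1 if megabytes is None else threshold_bytes(megabytes)
    # The value is sent as a string, which spark.conf.set accepts for any key.
    spark.conf.set(AUTO_BROADCAST_KEY, str(value))
    return value
```

With a real session, `set_auto_broadcast(spark, 50)` would raise the threshold to 50 MiB and `set_auto_broadcast(spark)` would turn automatic broadcasting off. Calling `explain()` on a join is one way to check whether the plan then uses a broadcast join.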