Databricks Certified Data Engineer Professional — Question 151
Which configuration parameter directly affects the size of a Spark partition when data is ingested into Spark?
Answer options
- A. spark.sql.files.maxPartitionBytes
- B. spark.sql.autoBroadcastJoinThreshold
- C. spark.sql.adaptive.advisoryPartitionSizeInBytes
- D. spark.sql.adaptive.coalescePartitions.minPartitionNum
Correct answer: A
Explanation
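The correct answer is A. spark.sql.files.maxPartitionBytes sets the maximum number of bytes packed into a single partition when Spark reads files (default 128 MB), so it directly bounds partition size at ingestion for file-based sources such as Parquet, JSON, and ORC. The other options tune different stages: spark.sql.autoBroadcastJoinThreshold sets the maximum size of a table that is broadcast in a join; spark.sql.adaptive.advisoryPartitionSizeInBytes is the target size for shuffle partitions under Adaptive Query Execution, which applies after a shuffle rather than at read time; and spark.sql.adaptive.coalescePartitions.minPartitionNum sets the minimum number of shuffle partitions after coalescing, which is a count rather than a size.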
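To make the effect of option A concrete, the following is a minimal sketch in plain Python that estimates how many read partitions a single splittable file would produce for a given limit. The helper name estimate_read_partitions is hypothetical, and the arithmetic is a deliberate simplification: it assumes every partition is filled up to the limit and ignores other factors Spark considers, such as file-open cost and available parallelism.

```python
import math

# Documented default of spark.sql.files.maxPartitionBytes (128 MB).
DEFAULT_MAX_PARTITION_BYTES = 128 * 1024 * 1024


def estimate_read_partitions(file_size_bytes: int,
                             max_partition_bytes: int = DEFAULT_MAX_PARTITION_BYTES) -> int:
    """Rough count of input partitions for one splittable file.

    Hypothetical helper, not a Spark API. Simplification: each partition is
    assumed to be filled up to max_partition_bytes; file-open cost and
    cluster parallelism are ignored.
    """
    if file_size_bytes < 0 or max_partition_bytes <= 0:
        raise ValueError("file size must be non-negative and the limit positive")
    return math.ceil(file_size_bytes / max_partition_bytes)


one_gib = 1024 ** 3
print(estimate_read_partitions(one_gib))                    # prints 8 (128 MB limit)
print(estimate_read_partitions(one_gib, 64 * 1024 * 1024))  # prints 16 (64 MB limit)
```

Halving the limit roughly doubles the number of read partitions for the same input. In a live PySpark session the limit itself would typically be changed with spark.conf.set("spark.sql.files.maxPartitionBytes", ...) before the read; the actual partition count Spark produces may differ from this estimate.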