Databricks Certified Data Engineer Professional — Question 13

Which configuration parameter directly affects the size of a Spark partition when data is read into Spark?

Answer options

Correct answer: A

Explanation

The correct answer, spark.sql.files.maxPartitionBytes, sets the maximum number of bytes Spark packs into a single partition when reading file-based sources such as Parquet, JSON, and ORC. Its default is 128 MB. The other options, while relevant to Spark's performance and behavior, do not determine the size of partitions at read time.
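
To make the effect of the setting concrete, here is a minimal pure-Python sketch of the arithmetic Spark applies when it plans a file scan. It is a simplification for illustration, not Spark's actual code: the function names, the parallelism of 8, and the 10 GiB example file are assumptions, while the 128 MB and 4 MB defaults correspond to spark.sql.files.maxPartitionBytes and spark.sql.files.openCostInBytes.

```python
import math

MiB = 1024 * 1024
GiB = 1024 * MiB


def max_split_bytes(file_sizes, max_partition_bytes=128 * MiB,
                    open_cost_bytes=4 * MiB, parallelism=8):
    """Approximate the per-partition byte cap for a file scan.

    max_partition_bytes models spark.sql.files.maxPartitionBytes,
    open_cost_bytes models spark.sql.files.openCostInBytes, and
    parallelism is a hypothetical number of available cores.
    """
    # Each file is charged its length plus a fixed "open cost".
    total = sum(size + open_cost_bytes for size in file_sizes)
    bytes_per_core = total // parallelism
    # The configured maximum is the upper bound on the split size.
    return min(max_partition_bytes, max(open_cost_bytes, bytes_per_core))


def estimate_splits(file_sizes, **kwargs):
    """Count the splits produced for splittable files.

    This is an upper-bound estimate of the partition count: Spark may
    pack several small splits into one partition, so the real number
    can be lower.
    """
    split = max_split_bytes(file_sizes, **kwargs)
    return sum(math.ceil(size / split) for size in file_sizes)


# Hypothetical input: one splittable 10 GiB file.
files = [10 * GiB]
default_splits = estimate_splits(files)
halved_splits = estimate_splits(files, max_partition_bytes=64 * MiB)

# In a live PySpark session the setting would be changed with, e.g.:
#   spark.conf.set("spark.sql.files.maxPartitionBytes", str(64 * MiB))
```

Under these assumptions, halving the maximum partition size doubles the number of splits for a large file, which is why this parameter is the one that governs partition size at read time.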