Databricks Certified Associate Developer for Apache Spark — Question 81
In what order should the lines of code below be run to write DataFrame storesDF to file path filePath as Parquet, partitioned by the values in column division?
Lines of code:
1. .write() \
2. .partitionBy("division") \
3. .parquet(filePath)
4. storesDF \
5. .repartition("division")
6. .write \
7. .path(filePath, "parquet")
Answer options
- A. 4, 1, 2, 3
- B. 4, 1, 5, 7
- C. 4, 6, 2, 3
- D. 4, 1, 5, 3
- E. 4, 6, 2, 7
Correct answer: C
Explanation
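The correct sequence is C: start with the DataFrame (4), access its write attribute (6) to get a DataFrameWriter, call partitionBy("division") on that writer (2), and finish with parquet(filePath) (3), which performs the write. Options A, B, and D use line 1: in PySpark, write is a property rather than a method, so write() fails because the DataFrameWriter it returns is not callable. Options B and D also use line 5: repartition is a DataFrame method, not a DataFrameWriter method, and it controls in-memory partitioning rather than the directory layout on disk. Options B and E end with line 7, but DataFrameWriter has no path method.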