Databricks Certified Associate Developer for Apache Spark — Question 96
The code block shown below contains an error. The code block is intended to create a single-column DataFrame from the Python list years, which is made up of integers. Identify the error.
Code block:
spark.createDataFrame(years, IntegerType)
Answer options
- A. The column name must be specified.
- B. The years list should be wrapped in another list like [years] to make clear that it is a column rather than a row.
- C. There is no createDataFrame operation in spark.
- D. The IntegerType call must be followed by parentheses.
- E. The IntegerType call should not be present — Spark can tell that list years is full of integers.
Correct answer: D
Explanation
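The correct answer is D. IntegerType is a class, and createDataFrame expects an instance of a data type as its schema argument, so it must be written as IntegerType() with parentheses. Passing the bare class IntegerType hands Spark the class object itself rather than a data type, and the call fails. Option A is incorrect because a column name is not required here: when the schema is a single atomic type, Spark names the column value by default. Option B is incorrect because each element of the input list is treated as one row, so years already describes one column with one integer per row; wrapping it as [years] would instead describe a single row containing the whole list. Option C is false because createDataFrame is a valid method on the SparkSession. Option E is incorrect because PySpark generally cannot infer a schema from a list of bare integers, so the type has to be supplied for this call to work.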