Databricks Certified Associate Developer for Apache Spark — Question 62
The code block shown below contains an error. The code block is intended to create a single-column DataFrame from the Scala List years, which is made up of integers. Identify the error.
Code block:
spark.createDataset(years)
Answer options
- A. The years list should be wrapped in another list like List(years) to make clear that it is a column rather than a row.
- B. The data type is not specified – the second argument to createDataset should be IntegerType.
- C. There is no operation createDataset – the createDataFrame operation should be used instead.
- D. The result of the above is a Dataset rather than a DataFrame – the toDF operation must be called at the end.
- E. The column name must be specified as the second argument to createDataset.
Correct answer: D
Explanation
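The correct answer is D. The createDataset method returns a Dataset (here a Dataset[Int]), whereas the task calls for a DataFrame, so toDF must be called on the result. The other options do not apply: wrapping years in another list (A) would produce a single row holding the whole list rather than one row per integer; createDataset takes neither a data type (B) nor a column name (E) as a second argument; and createDataset is an existing SparkSession operation (C).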
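To make the fix in option D concrete, here is a minimal sketch of the corrected call. It assumes a local SparkSession and Spark SQL on the classpath; the object name YearsExample, the helper yearsToDF, and the sample year values are hypothetical and not part of the original question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical wrapper object, used only for illustration.
object YearsExample {

  // Converts a List[Int] into a single-column DataFrame.
  def yearsToDF(spark: SparkSession, years: List[Int]): DataFrame = {
    // Brings the implicit Encoder[Int] into scope, which createDataset requires.
    import spark.implicits._

    // createDataset yields a Dataset[Int]; toDF() converts it to a DataFrame.
    // For a primitive element type the single column is named "value".
    spark.createDataset(years).toDF()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is acceptable for this sketch.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("years-example")
      .getOrCreate()

    // Assumed sample data; the question does not give the contents of years.
    val years = List(2018, 2019, 2020)

    val df = yearsToDF(spark, years)
    df.printSchema()

    // toDF also accepts column names if "value" is not wanted.
    df.toDF("year").show()

    spark.stop()
  }
}
```

Note that the column name is supplied through toDF rather than through createDataset, which is also why option E does not describe the error.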