Databricks Certified Associate Developer for Apache Spark — Question 42
The code block shown below contains an error. The code block is intended to return a new DataFrame that is the result of a position-wise union between DataFrame storesDF and DataFrame acquiredStoresDF. Identify the error.
Code block:
storesDF.unionByName(acquiredStoresDF)
Answer options
- A. There is no DataFrame.unionByName() operation – the concat() operation should be used instead with both DataFrames as arguments.
- B. There are no key columns specified – similar column names should be the second argument.
- C. The DataFrame.unionByName() operation does not union DataFrames based on column position – it uses column name instead.
- D. The unionByName() operation is a standalone operation rather than a method of DataFrame – it should have both DataFrames as arguments.
- E. There are no column positions specified – the desired column positions should be the second argument.
Correct answer: C
Explanation
The correct answer is C: unionByName() is a DataFrame method that matches columns by name rather than by position, so the code block does not perform a position-wise union. The position-wise operation is DataFrame.union(), so the intended call is storesDF.union(acquiredStoresDF). Option A is incorrect because DataFrame.unionByName() is a valid operation. Options B and E are incorrect because unionByName() takes neither key columns nor column positions as an argument. Option D is incorrect because unionByName() is a method of DataFrame, not a standalone operation.
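The difference between the two semantics can be sketched in plain Python. This is only a model and not Spark itself: the helper functions union_by_position and union_by_name, the (columns, rows) representation, and the sample data are all hypothetical, chosen to mimic how DataFrame.union() and DataFrame.unionByName() treat two inputs whose columns are in a different order.

```python
# Plain-Python model of the two union semantics (NOT Spark itself).
# A "frame" is a hypothetical (column_names, rows) pair.

def union_by_position(left, right):
    """Models DataFrame.union(): keep left's column names, append right's rows unchanged."""
    left_cols, left_rows = left
    right_cols, right_rows = right
    if len(left_cols) != len(right_cols):
        raise ValueError("column counts differ")
    return left_cols, left_rows + right_rows

def union_by_name(left, right):
    """Models DataFrame.unionByName(): reorder right's values to line up with left's names."""
    left_cols, left_rows = left
    right_cols, right_rows = right
    if set(left_cols) != set(right_cols):
        raise ValueError("column names differ")
    order = [right_cols.index(name) for name in left_cols]
    reordered = [tuple(row[i] for i in order) for row in right_rows]
    return left_cols, left_rows + reordered

# Hypothetical sample data: same columns, different order.
stores = (["storeId", "city"], [(1, "Austin")])
acquired = (["city", "storeId"], [("Denver", 2)])

by_position = union_by_position(stores, acquired)
by_name = union_by_name(stores, acquired)

print(by_position)  # (['storeId', 'city'], [(1, 'Austin'), ('Denver', 2)])
print(by_name)      # (['storeId', 'city'], [(1, 'Austin'), (2, 'Denver')])
```

In the position-wise result, "Denver" lands under storeId because column names are ignored. In the name-wise result, the values are realigned first. The two results only coincide when both inputs list their columns in the same order.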