Databricks Certified Associate Developer for Apache Spark — Question 72
The code block shown below contains an error. The code block is intended to return a new DataFrame that is the result of an inner join between DataFrame storesDF and DataFrame employeesDF on column storeId. Identify the error.
Code block:
StoresDF.join(employeesDF, Seq("storeId")
Answer options
- A. The key column storeId needs to be a string like “storeId”.
- B. The key column storeId needs to be specified in an expression of both DataFrame columns like storesDF.storeId === employeesDF.storeId.
- C. The default argument to the joinType parameter is “inner” - an additional argument of “left” must be specified.
- D. There is no DataFrame.join() operation - DataFrame.merge() should be used instead.
- E. The key column storeId needs to be wrapped in the col() operation.
Correct answer: A
Explanation
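The correct answer is A. The join method takes the key column name as a string, so storeId written without quotation marks is read as an undefined identifier rather than a column name, and the code fails. Option B is incorrect because a join expression is something join also accepts, not something it requires. Option C is incorrect because the default join type is already “inner”, so no extra argument is needed, and “left” would produce a different join. Option D is incorrect because DataFrame.join() does exist. Option E is incorrect because a key column passed by name does not need to be wrapped in col().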
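For illustration, here is a minimal Scala sketch of the intended inner join with the key column given as a string. The sample rows, the column names city and employeeName, the object name JoinSketch, and the helper innerJoinOnStoreId are all hypothetical and are not part of the original question. It assumes Spark is on the classpath and runs in local mode.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object JoinSketch {
  // Inner join on a shared key column. With a Seq of column names and no
  // join type argument, Spark performs an inner join by default.
  def innerJoinOnStoreId(storesDF: DataFrame, employeesDF: DataFrame): DataFrame =
    storesDF.join(employeesDF, Seq("storeId"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("join-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data, only for demonstration.
    val storesDF = Seq((1, "Austin"), (2, "Boston"), (3, "Chicago"))
      .toDF("storeId", "city")
    val employeesDF = Seq((1, "Ana"), (1, "Ben"), (2, "Cy"), (4, "Dee"))
      .toDF("storeId", "employeeName")

    // Key passed as a string inside Seq: the result has one storeId column.
    val joined = innerJoinOnStoreId(storesDF, employeesDF)
    joined.show()

    // Alternative join-expression form: also valid, but the result keeps
    // the storeId column from both DataFrames.
    val joinedExpr = storesDF.join(
      employeesDF,
      storesDF("storeId") === employeesDF("storeId"),
      "inner"
    )
    joinedExpr.show()

    spark.stop()
  }
}
```

Passing the key as Seq("storeId") is usually the more convenient form when both DataFrames share the column name, since the joined result carries a single storeId column instead of two.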