Databricks Certified Associate Developer for Apache Spark — Question 152
The code block shown below should return a new DataFrame that is the result of an inner join between DataFrame storesDF and DataFrame employeesDF on column storeId and column employeeId. Choose the response that correctly fills in the numbered blanks within the code block to complete this task.
Code block:
storesDF.join(employeesDF, [__1__ == __2__, __3__ == __4__])
Answer options
- A. 1. storesDF.storeId 2. storesDF.employeeId 3. employeesDF.storeId 4. employeesDF.employeeId
- B. 1. col("storeId") 2. col("storeId") 3. col("employeeId") 4. col("employeeId")
- C. 1. storeId 2. storeId 3. employeeId 4. employeeId
- D. 1. col("storeId") 2. col("employeeId") 3. col("employeeId") 4. col("storeId")
- E. 1. storesDF.storeId 2. employeesDF.storeId 3. storesDF.employeeId 4. employeesDF.employeeId
Correct answer: E
Explanation
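The correct answer is E. Each join condition must compare a column from storesDF with the same-named column from employeesDF, so the first condition is storesDF.storeId == employeesDF.storeId and the second is storesDF.employeeId == employeesDF.employeeId. When a list of column expressions is passed to join(), the conditions are combined with AND, and the join type defaults to inner. Option A compares two columns of the same DataFrame in each condition (storesDF.storeId with storesDF.employeeId), so it never relates rows across the two DataFrames on the intended keys. Options B and D use col() without naming a DataFrame; because both DataFrames contain storeId and employeeId, those references are ambiguous, and D additionally pairs storeId with employeeId. Option C uses bare names (storeId, employeeId) that are not defined anywhere in the code block.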