Databricks Certified Associate Developer for Apache Spark — Question 36
The code block shown below contains an error. The code block is intended to use SQL to return a new DataFrame containing column storeId and column managerName from a table created from DataFrame storesDF. Identify the error.
Code block:
storesDF.createOrReplaceTempView("stores")
storesDF.sql("SELECT storeId, managerName FROM stores")
Answer options
- A. The createOrReplaceTempView() operation does not make a DataFrame accessible via SQL.
- B. The sql() operation should be accessed via the spark variable rather than DataFrame storesDF.
- C. There is no sql() operation in DataFrame storesDF. The operation query() should be used instead.
- D. This cannot be accomplished using SQL – the DataFrame API should be used instead.
- E. The createOrReplaceTempView() operation should be accessed via the spark variable rather than DataFrame storesDF.
Correct answer: B
Explanation
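The correct answer is B. sql() is a method of SparkSession, not of DataFrame, so the query must be run as spark.sql("SELECT storeId, managerName FROM stores"). Option A is incorrect because createOrReplaceTempView() is exactly what registers a DataFrame as a temporary view that SQL can query. Option C is incorrect because DataFrame has no query() operation. Option D is incorrect because selecting two columns from a temporary view is a routine SQL query in Spark. Option E is incorrect because createOrReplaceTempView() is a DataFrame method, so calling it on storesDF is already correct.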
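The corrected call pattern can be sketched in PySpark as follows. This is a minimal illustration, not part of the exam material: the helper name select_store_managers, the demo() function, and the sample rows and employeeCount column are all hypothetical, and spark is assumed to be an active SparkSession.

```python
def select_store_managers(spark, storesDF):
    """Return storeId and managerName from storesDF using SQL.

    Assumes `spark` is a SparkSession and `storesDF` is a DataFrame
    that contains both columns.
    """
    # createOrReplaceTempView() is a DataFrame method: it registers
    # storesDF under the view name "stores" for this session.
    storesDF.createOrReplaceTempView("stores")
    # sql() is a SparkSession method, so it is called on spark,
    # not on storesDF.
    return spark.sql("SELECT storeId, managerName FROM stores")


def demo():
    # Imported here so the helper above can be defined and read
    # without PySpark installed; running demo() does require it.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("question36")
        .getOrCreate()
    )
    # Hypothetical sample data; the exam does not specify the schema.
    storesDF = spark.createDataFrame(
        [(1, "Ana", 12), (2, "Ben", 7)],
        ["storeId", "managerName", "employeeCount"],
    )
    select_store_managers(spark, storesDF).show()
    spark.stop()
```

Calling demo() needs a local PySpark installation and a Java runtime. In a Databricks notebook the spark variable already exists, so only the two lines inside select_store_managers are needed.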