Databricks Certified Data Engineer Associate — Question 63
A data analyst has created a Delta table sales that is used by the entire data analysis team. They want help from the data engineering team to implement a series of tests to ensure the data is clean. However, the data engineering team uses Python for its tests rather than SQL.
Which command could the data engineering team use to access sales in PySpark?
Answer options
- A. SELECT * FROM sales
- B. spark.table("sales")
- C. spark.sql("sales")
- D. spark.delta.table("sales")
Correct answer: B
Explanation
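The correct answer is B. 'spark.table("sales")' returns the table registered under the name sales as a PySpark DataFrame, which the team can then test in Python. Option A is a SQL statement rather than Python code; it would run in PySpark only if passed as a string to 'spark.sql', as in 'spark.sql("SELECT * FROM sales")'. Option C calls 'spark.sql' correctly but passes only the table name, and 'sales' on its own is not a valid SQL query, so the call fails to parse. Option D is incorrect because a SparkSession has no 'delta' attribute, so 'spark.delta.table' does not exist.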
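As a minimal sketch of how option B might be used in a Python test, the helpers below load the table with 'spark.table' and count rows where a column is null. The helper names (load_sales, count_nulls) and the column name amount are hypothetical and do not come from the question; the sketch also assumes an existing SparkSession is passed in, such as the spark object that Databricks notebooks predefine.

```python
def load_sales(spark, table_name="sales"):
    """Return the named table as a DataFrame using SparkSession.table."""
    return spark.table(table_name)


def count_nulls(df, column):
    """Count rows where `column` is NULL.

    DataFrame.filter accepts a SQL expression string, so no extra
    imports are needed. `column` is assumed to be a plain column name.
    """
    return df.filter(f"{column} IS NULL").count()


# Hypothetical usage inside a notebook where `spark` already exists
# ("amount" is an assumed column, not one named in the question):
#
#   sales_df = load_sales(spark)
#   assert count_nulls(sales_df, "amount") == 0, "sales.amount has NULLs"
```

Passing the session in as an argument, rather than creating it inside the helper, keeps the check easy to run against a small test table or a stand-in session.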