Databricks Certified Associate Developer for Apache Spark — Question 168
Which of the following code blocks returns a DataFrame containing only the rows from DataFrame storesDF where the value in column sqft is less than or equal to 25,000?
Answer options
- A. storesDF.where(storesDF[sqft] > 25000)
- B. storesDF.filter(sqft > 25000)
- C. storesDF.filter("sqft" <= 25000)
- D. storesDF.filter(col("sqft") <= 25000)
- E. storesDF.where(sqft > 25000)
Correct answer: D
Explanation
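Option D is correct: it wraps the column name in col to build a Column expression and applies the <= comparison, so filter keeps only the rows where sqft is at most 25,000. Options A, B, and E test for values greater than 25,000, the opposite of what is asked, and they also reference a bare sqft name that is not defined as a variable. Option C has the right operator but compares the string literal "sqft" with an integer rather than with a column; in PySpark that comparison raises a TypeError in Python before Spark evaluates anything. Note that filter and where are aliases, so the choice of method is not what separates the options.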
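The difference between options C and D can be sketched in a few lines. This is a minimal illustration assuming the question targets PySpark (the source does not say which language is used); the helper names option_c_condition, filter_small_stores, and demo, the storeId column, and the sample rows are all hypothetical and not part of the original question.

```python
def option_c_condition():
    """Evaluate option C's condition the way Python does, before Spark is involved."""
    try:
        # A str compared with an int: Python cannot order these two types.
        return "sqft" <= 25000
    except TypeError as exc:
        return exc


def filter_small_stores(stores_df, max_sqft=25000):
    """Option D: keep rows whose sqft value is at most max_sqft."""
    # Imported here so the module still loads where PySpark is not installed.
    from pyspark.sql.functions import col

    return stores_df.filter(col("sqft") <= max_sqft)


def demo():
    """Build a tiny, made-up storesDF and apply the filter (requires PySpark)."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").appName("sqft-demo").getOrCreate()
    # Hypothetical sample data: storeId and sqft columns.
    stores_df = spark.createDataFrame(
        [(1, 12000), (2, 25000), (3, 31000)], ["storeId", "sqft"]
    )
    kept = sorted(row["storeId"] for row in filter_small_stores(stores_df).collect())
    spark.stop()
    return kept
```

Calling option_c_condition() returns the TypeError that Python raises for option C's expression, while demo() should keep stores 1 and 2, since the boundary value 25,000 satisfies <=. A SQL expression string such as storesDF.filter("sqft <= 25000") would also work, but that is not what option C writes: there the comparison sits outside the quotes.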