Databricks Certified Associate Developer for Apache Spark — Question 76
Which of the following code blocks returns a DataFrame containing only the rows from DataFrame storesDF where the value in column sqft is less than or equal to 25,000 AND the value in column customerSatisfaction is greater than or equal to 30?
Answer options
- A. storesDF.filter((col("sqft") <= 25000) and (col("customerSatisfaction") >= 30))
- B. storesDF.filter((col("sqft") <= 25000) or (col("customerSatisfaction") >= 30))
- C. storesDF.filter(sqft <= 25000 and customerSatisfaction >= 30)
- D. storesDF.filter((col("sqft") <= 25000) & (col("customerSatisfaction") >= 30))
- E. storesDF.filter(sqft <= 25000 & customerSatisfaction >= 30)
Correct answer: D
Explanation
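Option D is correct because the '&' operator combines two Column expressions into a single condition, so only rows that satisfy both criteria are kept. Each comparison must be wrapped in its own parentheses, as in (col("sqft") <= 25000) & (col("customerSatisfaction") >= 30), because Python's '&' binds more tightly than '<=' and '>='. Option A uses Python's 'and', which tries to convert a Column to a single boolean and raises an error instead of building a filter condition. Option B fails the same way with 'or', and a logical OR would not require both conditions anyway. Options C and E refer to sqft and customerSatisfaction as bare Python names rather than Column objects created with col(), so they fail before Spark evaluates the filter.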
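The difference between 'and' and '&' comes from how Python evaluates the expression before Spark sees it. The sketch below reproduces that behavior without a Spark installation. FakeColumn is a hypothetical stand-in written for this illustration, not PySpark's Column class; it is only assumed to mirror two properties of Column: operators build a new expression, and conversion to a boolean raises an error.

```python
# Hypothetical stand-in for a Spark Column: it only records the expression
# text, which is enough to show how Python itself parses each option.
class FakeColumn:
    def __init__(self, text):
        self.text = text

    def __le__(self, other):
        return FakeColumn(f"({self.text} <= {_text(other)})")

    def __ge__(self, other):
        return FakeColumn(f"({self.text} >= {_text(other)})")

    def __and__(self, other):
        return FakeColumn(f"({self.text} AND {_text(other)})")

    def __rand__(self, other):
        # Reached for `25000 & column`, i.e. when parentheses are missing.
        return FakeColumn(f"({_text(other)} AND {self.text})")

    def __bool__(self):
        # Assumption: like a real Column, refuse truth-value conversion.
        raise ValueError("cannot convert column into bool; use & instead of and")


def _text(value):
    # Render either a FakeColumn or a plain Python literal as text.
    return value.text if isinstance(value, FakeColumn) else repr(value)


sqft = FakeColumn("sqft")
satisfaction = FakeColumn("customerSatisfaction")

# Parenthesized comparisons combined with &: builds one AND expression.
good = (sqft <= 25000) & (satisfaction >= 30)
print(good.text)

# Python's `and` calls bool() on the left operand, which raises.
try:
    (sqft <= 25000) and (satisfaction >= 30)
    and_error = None
except ValueError as exc:
    and_error = str(exc)
print(and_error)

# Without parentheses, & binds first: sqft <= (25000 & satisfaction) >= 30.
# That chained comparison also needs bool(), so it raises as well.
try:
    sqft <= 25000 & satisfaction >= 30
    unparenthesized_error = None
except ValueError as exc:
    unparenthesized_error = str(exc)
print(unparenthesized_error)
```

With a real SparkSession, the equivalent working call is the parenthesized form from option D, using col from pyspark.sql.functions.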