Databricks Certified Machine Learning Associate — Question 6

A data scientist has a Spark DataFrame spark_df. They want to create a new Spark DataFrame that contains only the rows from spark_df where the value in column discount is less than or equal to 0.
Which of the following code blocks will accomplish this task?

Answer options

Correct answer: C

Explanation

The correct answer is C because it uses the filter method, which is the Spark DataFrame method for keeping only the rows that satisfy a condition. filter returns a new DataFrame and leaves spark_df unchanged, which matches the requirement. Options A and B are not valid for Spark DataFrames because they resemble pandas syntax, while option D calls loc with parentheses instead of brackets, which makes it invalid.
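
As a rough illustration of the filter approach described above, here is a minimal sketch. The helper name keep_non_positive_discount is hypothetical. The sketch assumes spark_df is a PySpark DataFrame with a numeric discount column. It passes the condition as a SQL expression string, which may differ from the exact wording of option C.

```python
def keep_non_positive_discount(spark_df):
    """Return a new DataFrame with only the rows where discount <= 0.

    Hypothetical helper for illustration; assumes spark_df is a
    pyspark.sql.DataFrame that has a numeric `discount` column.
    """
    # DataFrame.filter accepts either a SQL expression string or a Column
    # expression. The Column form of the same condition would be
    # spark_df.filter(spark_df["discount"] <= 0).
    return spark_df.filter("discount <= 0")
```

With an active SparkSession, calling keep_non_positive_discount(spark_df).show() would display the remaining rows. spark_df itself is not modified.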