Databricks Certified Associate Developer for Apache Spark — Question 164
Which of the following code blocks returns a DataFrame where rows in DataFrame storesDF containing missing values in every column have been dropped?
Answer options
- A. storesDF.na.drop()
- B. storesDF.dropna()
- C. storesDF.na.drop("all", subset = "sqft")
- D. storesDF.na.drop("all")
- E. storesDF.nadrop("all")
Correct answer: D
Explanation
Option D is correct: storesDF.na.drop("all") drops a row only when every column in that row is null. Option A omits the how argument, which defaults to "any", so it drops rows that have a missing value in at least one column. Option B, storesDF.dropna(), is an alias of na.drop() with the same "any" default, so it behaves exactly like option A. Option C restricts the check to the sqft column, so it drops rows where sqft is null regardless of the other columns. Option E is syntactically valid but calls nadrop, which is not a DataFrame method, so it fails at runtime instead of returning a DataFrame.