Databricks Certified Associate Developer for Apache Spark — Question 74
Which of the following code blocks returns a new DataFrame with a new column customerSatisfactionAbs that is the absolute value of column customerSatisfaction in DataFrame storesDF? Note that column customerSatisfactionAbs is not in the original DataFrame storesDF.
Answer options
- A. storesDF.withColumn("customerSatisfactionAbs", abs(col("customerSatisfaction")))
- B. storesDF.withColumnRenamed("customerSatisfactionAbs", abs(col("customerSatisfaction")))
- C. storesDF.withColumn(col("customerSatisfactionAbs", abs(col("customerSatisfaction")))
- D. storesDF.withColumn("customerSatisfactionAbs", abs(col(customerSatisfaction)))
- E. storesDF.withColumn("customerSatisfactionAbs", abs("customerSatisfaction"))
Correct answer: A
Explanation
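Option A is correct because withColumn takes the new column's name as a string and a Column expression for its values, and returns a new DataFrame; here the expression is abs(col("customerSatisfaction")), so the result contains customerSatisfactionAbs alongside the original columns. Option B uses withColumnRenamed, which only renames an existing column and expects two column names; customerSatisfactionAbs does not exist in storesDF, and the second argument is a Column expression rather than a name. Option C wraps the new column name in col() instead of passing it as a string, and its parentheses are unbalanced (one closing parenthesis is missing). Option D passes customerSatisfaction to col() without quotes, so it is read as an undefined variable rather than as a column name, which causes an error. Option E passes the string "customerSatisfaction" to abs, which the question treats as incorrect because abs expects a Column; in the Scala API, abs has no overload that accepts a String.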