Databricks Certified Associate Developer for Apache Spark — Question 201

A data engineer is investigating a Spark cluster that is underutilized during scheduled batch jobs. The Spark logs show that tasks are often killed due to timeout errors, along with several warnings about insufficient resources.

Which action should the engineer take to resolve the underutilization issue?

Answer options

Correct answer: C

Explanation

Increasing the number of executor instances adds task slots, so more tasks can run concurrently, which addresses the underutilization directly. Increasing executor memory or adjusting timeouts may help individual tasks complete, but neither adds capacity. Reducing partition size could improve scheduling, but it does not address the core capacity limitation.
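
The capacity argument can be checked with simple arithmetic. The sketch below is illustrative only: the helper names (total_task_slots, spark_submit_conf) and the executor counts are assumptions, not values from the question. It relies on the standard Spark properties spark.executor.instances, spark.executor.cores, and spark.task.cpus, and assumes static allocation. With dynamic allocation enabled, the executor count is instead bounded by spark.dynamicAllocation.maxExecutors.

```python
from typing import List


def total_task_slots(executor_instances: int, executor_cores: int, task_cpus: int = 1) -> int:
    """Upper bound on the number of tasks that can run at the same time.

    Each executor runs (executor_cores // task_cpus) tasks concurrently,
    so the cluster-wide bound scales linearly with the executor count.
    """
    return executor_instances * (executor_cores // task_cpus)


def spark_submit_conf(executor_instances: int, executor_cores: int) -> List[str]:
    """Build spark-submit --conf arguments for the chosen sizing (hypothetical helper)."""
    return [
        "--conf", f"spark.executor.instances={executor_instances}",
        "--conf", f"spark.executor.cores={executor_cores}",
    ]


# Hypothetical sizing: going from 2 to 6 executors with 4 cores each.
slots_before = total_task_slots(executor_instances=2, executor_cores=4)
slots_after = total_task_slots(executor_instances=6, executor_cores=4)
print(slots_before, slots_after)
print(" ".join(spark_submit_conf(6, 4)))
```

Raising executor memory or a timeout leaves the slot count unchanged in this model, which is why those changes do not add capacity on their own.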