Databricks Certified Associate Developer for Apache Spark — Question 180

A data scientist is working with a massive dataset that exceeds the memory capacity of a single machine. The data scientist is considering using Apache Spark™ instead of processing the data with a traditional single-machine language, such as standard Python scripts.

Which two advantages does Apache Spark™ offer over a single-machine language in this scenario? (Choose two.)

Answer options

Correct answer: B, D

Explanation

The correct answers are B and D. Apache Spark™ can recover from node failures, and it distributes processing tasks across the machines of a cluster; both matter when a dataset is too large for a single machine. Options A and C are incorrect because Spark still requires writing code and is designed to use memory effectively. Option E is incorrect because Spark can run on commodity hardware.
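
The two ideas behind B and D can be illustrated with a toy model in plain Python. This is only a sketch and does not use Spark's API: the names `split_into_partitions`, `FlakyWorker`, `run_with_retries`, and `distributed_sum` are hypothetical, threads stand in for cluster nodes, and a simple retry stands in for Spark's recomputation of lost work.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(data, num_partitions):
    """Split a list into contiguous chunks, one per (simulated) worker."""
    size = max(1, -(-len(data) // num_partitions))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


class FlakyWorker:
    """Hypothetical worker that fails once for each listed partition id."""

    def __init__(self, failing_partitions=()):
        self.remaining_failures = set(failing_partitions)

    def sum_partition(self, partition_id, partition):
        if partition_id in self.remaining_failures:
            # Simulate a node dying while it processes this partition.
            self.remaining_failures.discard(partition_id)
            raise RuntimeError(f"worker lost while processing partition {partition_id}")
        return sum(partition)


def run_with_retries(worker, partition_id, partition, max_attempts=3):
    """Re-run a failed task instead of failing the whole job."""
    for attempt in range(1, max_attempts + 1):
        try:
            return worker.sum_partition(partition_id, partition)
        except RuntimeError:
            if attempt == max_attempts:
                raise


def distributed_sum(data, num_partitions=4, failing_partitions=()):
    """Sum `data` by processing partitions in parallel and combining the results."""
    partitions = split_into_partitions(data, num_partitions)
    worker = FlakyWorker(failing_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_sums = list(
            pool.map(
                lambda item: run_with_retries(worker, item[0], item[1]),
                enumerate(partitions),
            )
        )
    return sum(partial_sums)


if __name__ == "__main__":
    # Partition 1 fails once, is retried, and the job still completes.
    print(distributed_sum(list(range(1, 101)), num_partitions=4, failing_partitions={1}))
```

In this model, no single task ever holds the whole dataset (the distribution in D), and a failed task is rerun without restarting the job (the recovery in B). Real Spark achieves both across separate machines, which a single-process sketch like this cannot show.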