Databricks Certified Associate Developer for Apache Spark — Question 80

Which of the following storage levels should be used to store as much data as possible in memory on two cluster nodes while storing any data that does not fit in memory on disk to be read in when needed?

Answer options

Correct answer: D

Explanation

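The correct answer, MEMORY_AND_DISK_2, keeps as many partitions as possible in memory and writes the partitions that do not fit to disk, where they are read back when needed. The _2 suffix means each partition is replicated on two cluster nodes. The similar levels each miss one requirement: MEMORY_ONLY never spills to disk (partitions that do not fit are recomputed when needed) and keeps only one copy, MEMORY_ONLY_2 replicates on two nodes but still does not spill to disk, and MEMORY_AND_DISK spills to disk but stores each partition on a single node.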
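
The reasoning above can be sketched as a small lookup over the three properties the question asks about. This is a plain-Python illustration, not PySpark itself: the `Level` tuple and the `pick_level` helper are hypothetical names introduced here, and the table only covers the common on-heap levels.

```python
from typing import NamedTuple, Optional


class Level(NamedTuple):
    """Hypothetical stand-in for the flags a Spark storage level carries."""
    use_disk: bool      # may partitions be written to disk?
    use_memory: bool    # may partitions be kept in memory?
    replication: int    # number of cluster nodes holding each partition


# Flag values follow the documented meaning of each storage level name.
LEVELS = {
    "MEMORY_ONLY": Level(False, True, 1),
    "MEMORY_ONLY_2": Level(False, True, 2),
    "MEMORY_AND_DISK": Level(True, True, 1),
    "MEMORY_AND_DISK_2": Level(True, True, 2),
    "DISK_ONLY": Level(True, False, 1),
    "DISK_ONLY_2": Level(True, False, 2),
}


def pick_level(use_memory: bool, use_disk: bool, replication: int) -> Optional[str]:
    """Return the name of the level matching all three requirements, if any."""
    wanted = Level(use_disk, use_memory, replication)
    for name, level in LEVELS.items():
        if level == wanted:
            return name
    return None


# The question asks for: memory first, spill to disk, two nodes.
print(pick_level(use_memory=True, use_disk=True, replication=2))
# prints "MEMORY_AND_DISK_2"
```

In an actual Spark job, the level would be applied with something like `df.persist(StorageLevel.MEMORY_AND_DISK_2)` after `from pyspark import StorageLevel`, where `df` is whatever DataFrame is being cached.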