Databricks Certified Associate Developer for Apache Spark — Question 136
Which of the following best describes the similarities and differences between the MEMORY_ONLY storage level and the MEMORY_AND_DISK storage level?
Answer options
- A. The MEMORY_ONLY storage level will store as much data as possible in memory and will store any data that does on fit in memory on disk and read it as it's called. The MEMORY_AND_DISK storage level will store as much data as possible in memory and will recompute any data that does not fit in memory as it’s called.
- B. The MEMORY_ONLY storage level will store as much data as possible in memory on two cluster nodes and will recompute any data that does not fit in memory as it’s called. The MEMORY_AND_DISK storage level will store as much data as possible in memory on two cluster nodes and will store any data that does on fit in memory on disk and read it as it's called.
- C. The MEMORY_ONLY storage level will store as much data as possible in memory on two cluster nodes and will store any data that does on fit in memory on disk and read it as it's called. The MEMORY_AND_DISK storage level will store as much data as possible in memory on two cluster nodes and will recompute any data that does not fit in memory as it's called.
- D. The MEMORY_ONLY storage level will store as much data as possible in memory and will recompute any data that does not fit in memory as it's called. The MEMORY_AND_DISK storage level will store as much data as possible in memory and will store any data that does on fit in memory on disk and read it as it's called.
- E. The MEMORY_ONLY storage level will store as much data as possible in memory and will recompute any data that does not fit in memory as it’s called. The MEMORY_AND_DISK storage level will store half of the data in memory and store half of the memory on disk. This provides quick preview and better logical plan design.
Correct answer: D
Explanation
Option D accurately describes that MEMORY_ONLY keeps data in memory and recomputes what doesn't fit, while MEMORY_AND_DISK also stores in memory but saves overflow data to disk for retrieval. Other options either incorrectly state the behavior of MEMORY_ONLY or MEMORY_AND_DISK or incorrectly describe how data is managed across cluster nodes.