Databricks Certified Associate Developer for Apache Spark — Question 31
Which of the following code blocks returns a 15 percent sample of rows from DataFrame storesDF without replacement?
Answer options
- A. storesDF.sample(fraction = 0.10)
- B. storesDF.sampleBy(fraction = 0.15)
- C. storesDF.sample(True, fraction = 0.10)
- D. storesDF.sample()
- E. storesDF.sample(fraction = 0.15)
Correct answer: E
Explanation
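The correct answer is E. The options use PySpark syntax, where DataFrame.sample accepts withReplacement (default False), fraction, and seed, so storesDF.sample(fraction = 0.15) samples without replacement at a fraction of 0.15. The fraction is an expected proportion, so the result holds approximately, not exactly, 15 percent of the rows. Option A also samples without replacement, but at 10 percent. Option B fails because sampleBy performs stratified sampling and expects a column plus a dictionary of per-stratum fractions, not a single fraction argument. Option C is wrong on two counts: it passes True for withReplacement and uses a fraction of 0.10. Option D omits fraction, which sample requires, so the call raises an error rather than falling back to a default sample size.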
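To illustrate why sampling without replacement at a given fraction returns roughly, rather than exactly, that share of rows, here is a minimal sketch in plain Python. It assumes each row is kept independently with probability equal to the fraction (Bernoulli sampling). The function bernoulli_sample and the list store_ids are hypothetical names used only for illustration; this is a simplified model of the behavior, not Spark's implementation.

```python
import random


def bernoulli_sample(rows, fraction, seed=None):
    """Keep each row independently with probability `fraction`.

    Hypothetical helper modeling sampling without replacement:
    a row is either kept once or dropped, so it can never repeat.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0.0, 1.0]")
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < fraction]


# Hypothetical stand-in for the rows of storesDF.
store_ids = list(range(10_000))

sampled = bernoulli_sample(store_ids, fraction=0.15, seed=42)

# The count lands near 1,500 but is not guaranteed to equal it.
print(len(sampled))

# No row appears more than once in the sample.
print(len(sampled) == len(set(sampled)))
```

Changing the seed changes the sample size slightly, which mirrors why option E is described as a 15 percent sample only in expectation.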