Databricks Certified Associate Developer for Apache Spark — Question 137
Which of the following code blocks returns a 15 percent sample of rows from DataFrame storesDF without replacement?
Answer options
- A. storesDF.sample(True, fraction = 0.15)
- B. storesDF.sample(fraction = 0.15)
- C. storesDF.sampleBy(fraction = 0.15)
- D. storesDF.sample(fraction = 0.10)
- E. storesDF.sample()
Correct answer: B
Explanation
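Option B is correct: DataFrame.sample samples without replacement by default (withReplacement defaults to False), and fraction = 0.15 sets the expected share of rows to 15 percent. The fraction is applied as a per-row inclusion probability, so the number of rows returned is approximately, not exactly, 15 percent of the total. Option A passes True as withReplacement, so it samples with replacement. Option C misuses sampleBy, which performs stratified sampling and expects a column plus a dictionary of per-stratum fractions rather than a single fraction argument. Option D samples 10 percent instead of 15 percent, and option E supplies no fraction at all, which sample requires.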