Databricks Certified Associate Developer for Apache Spark — Question 148
The code block shown below should return a 25 percent sample of rows from DataFrame storesDF with reproducible results. Choose the response that correctly fills in the numbered blanks within the code block to complete this task.
Code block:
storesDF.__1__(__2__ = __3__, __4__ = __5__)
Answer options
- A. 1. sample 2. fraction 3. 0.25 4. seed 5. True
- B. 1. sample 2. withReplacement 3. True 4. seed 5. True
- C. 1. sample 2. fraction 3. 0.25 4. seed 5. 1234
- D. 1. sample 2. fraction 3. 0.15 4. seed 5. 1234
- E. 1. sample 2. withReplacement 3. True 4. seed 5. 1234
Correct answer: C
Explanation
The correct answer is C: 'fraction' set to 0.25 requests a 25 percent sample, and 'seed' set to 1234 fixes the random number generator so that repeated runs return the same rows. The completed line reads storesDF.sample(fraction = 0.25, seed = 1234). Option A passes 'True' as the seed instead of an integer, and option D requests a 15 percent sample rather than 25 percent. Options B and E set 'withReplacement' but never specify 'fraction', so they cannot request a 25 percent sample.
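
As a rough illustration of why a fixed seed makes the sample repeatable, the sketch below mimics sampling without replacement in plain Python: each row is kept independently with probability equal to the fraction. It is not Spark's implementation, and the helper `bernoulli_sample` and the list `rows` are hypothetical stand-ins for `sample` and storesDF.

```python
import random


def bernoulli_sample(rows, fraction, seed):
    """Keep each row independently with probability `fraction`.

    Hypothetical helper for illustration only; not part of any Spark API.
    """
    rng = random.Random(seed)  # a fixed seed makes the sequence of draws repeatable
    return [row for row in rows if rng.random() < fraction]


rows = list(range(1000))  # stand-in for the rows of storesDF

first = bernoulli_sample(rows, fraction=0.25, seed=1234)
second = bernoulli_sample(rows, fraction=0.25, seed=1234)

print(first == second)  # the same seed selects the same rows on both runs
print(len(first))       # roughly 250, but not guaranteed to be exactly 250
```

The second print shows why "25 percent" is approximate: Spark's documentation for `sample` likewise notes that the result is not guaranteed to contain exactly the requested fraction of rows.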