Designing and Implementing Enterprise-Scale Analytics Using Microsoft Azure and Power BI — Question 114
You are using a Python notebook in an Apache Spark pool in Azure Synapse Analytics.
You need to present the data distribution statistics from a DataFrame in a tabular view.
Which method should you invoke on the DataFrame?
Answer options
- A. summary
- B. cov
- C. sample
- D. rollup
Correct answer: A
Explanation
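The correct method is 'summary'. It returns a new DataFrame with a label column naming each statistic, one row per statistic (by default count, mean, stddev, min, the approximate 25%, 50%, and 75% percentiles, and max), and one column per column of the source DataFrame, so the distribution statistics are already laid out as a table. The other options do not describe a distribution: 'cov' returns the sample covariance of two numeric columns as a single number, 'sample' returns a random subset of the rows, and 'rollup' groups rows for hierarchical aggregation (subtotals and a grand total) and produces no output until an aggregation is applied to the grouped result.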
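As an illustration of the answer, here is a minimal PySpark sketch of calling 'summary' and displaying the result. The helper name `distribution_table`, the `demo` function, and the sample `region`/`sales` data are hypothetical and chosen only for this example; the `summary` call itself and its default statistics are the standard PySpark DataFrame API.

```python
# Statistics that DataFrame.summary() reports when called with no arguments.
DEFAULT_STATS = ("count", "mean", "stddev", "min", "25%", "50%", "75%", "max")


def distribution_table(df, stats=DEFAULT_STATS):
    """Return a DataFrame of distribution statistics, one row per statistic.

    `df` is expected to be a pyspark.sql.DataFrame. The result has a
    `summary` column naming each statistic plus one column per input column.
    (Hypothetical helper name; it only wraps DataFrame.summary.)
    """
    return df.summary(*stats)


def demo():
    """Build a tiny DataFrame and print its statistics tables (needs PySpark)."""
    # Imported lazily so the helper above can be defined without PySpark.
    from pyspark.sql import SparkSession

    # In a Synapse notebook a session already exists; getOrCreate() reuses it.
    spark = SparkSession.builder.getOrCreate()

    # Hypothetical sample data, for illustration only.
    df = spark.createDataFrame(
        [("north", 10.0), ("south", 20.0), ("east", 30.0), ("west", 40.0)],
        ["region", "sales"],
    )

    # All default statistics for every column, printed as a text table.
    distribution_table(df).show()

    # Only a chosen subset of statistics for a single column.
    distribution_table(df.select("sales"), ("min", "50%", "max")).show()
```

Calling `demo()` in a session where PySpark is available prints both tables. In a notebook that already holds a DataFrame `df`, the equivalent one-liner is `df.summary().show()`.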