Designing and Implementing Enterprise-Scale Analytics Using Microsoft Azure and Power BI — Question 125
You are using a Python notebook in an Apache Spark pool in Azure Synapse Analytics.
You need to present the data distribution statistics from a DataFrame in a tabular view.
Which method should you invoke on the DataFrame?
Answer options
- A. explain
- B. describe
- C. corr
- D. cov
Correct answer: B
Explanation
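The correct method is 'describe', which returns a new DataFrame of summary statistics for the numeric and string columns: count, mean, standard deviation, minimum, and maximum. Displaying that result (for example with 'show') presents the statistics as a table. In PySpark, 'describe' does not include quartiles; the related 'summary' method adds the 25%, 50%, and 75% percentiles. The 'explain' method prints the query execution plan, while 'corr' and 'cov' each return a single value for a pair of columns (correlation and covariance, respectively), so none of them presents distribution statistics.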
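As a rough illustration of what 'describe' reports for one numeric column, the sketch below recomputes the same five statistics in plain Python so it can run outside a Spark pool. The column name 'age', the sample values, and the helper 'describe_column' are hypothetical and not part of the question. In the notebook itself, the equivalent would simply be df.describe().show() on the DataFrame.

```python
import statistics

# Hypothetical sample values standing in for a numeric DataFrame column named "age".
ages = [23, 35, 41, 29, 52]


def describe_column(values):
    """Return the five statistics that PySpark's DataFrame.describe() reports."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        # Spark's stddev is the sample standard deviation (n - 1 denominator).
        "stddev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


stats = describe_column(ages)

# Print one row per statistic, loosely mirroring the layout of df.describe().show().
print(f"{'summary':<8}{'age':>10}")
for name, value in stats.items():
    print(f"{name:<8}{value:>10.2f}")
```

The table has one row per statistic and one column per DataFrame column, which is why 'describe' fits a requirement to show distribution statistics in a tabular view.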