Designing and Implementing Enterprise-Scale Analytics Using Microsoft Azure and Power BI — Question 4
You use an Apache Spark notebook in Azure Synapse Analytics to filter and transform data.
You need to review statistics for a DataFrame that includes:
- The column name
- The column type
- The number of distinct values
- Whether the column has missing values
Which function should you use?
Answer options
- A. displayHTML()
- B. display(df, summary=true)
- C. %%configure
- D. display(df)
- E. %%lsmagic
Correct answer: B
Explanation
Calling 'display(df, summary=true)' renders a statistics summary of the DataFrame that lists, for each column, its name, its type, the number of distinct values, and whether any values are missing. In a PySpark cell the flag is written 'summary=True', because Python booleans are capitalized. The other options do not produce this summary: 'display(df)' renders the rows of the DataFrame without the per-column statistics, 'displayHTML()' renders HTML content, '%%configure' sets Spark session configuration, and '%%lsmagic' lists the available magic commands.
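
The `display` function exists only inside the notebook environment, so it cannot be run as a standalone script. As a rough illustration of the four statistics the summary view reports, the sketch below computes them in plain Python for a small in-memory table. The helper name `summarize_columns` and the sample rows are hypothetical and are not part of Synapse or Spark; this is not how Synapse implements the summary.

```python
# In a Synapse notebook (PySpark cell) the built-in call would be:
#     display(df, summary=True)
# The helper below is a hypothetical stand-in that mimics the reported fields.

def summarize_columns(rows):
    """Return one summary dict per column for a list of row dicts."""
    # Collect column names in first-seen order.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    summary = []
    for name in columns:
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]  # None marks a missing value
        types = sorted({type(v).__name__ for v in present})
        summary.append({
            "column": name,
            "type": "/".join(types) if types else "unknown",
            "distinct": len(set(present)),            # distinct non-missing values
            "has_missing": len(present) < len(values),
        })
    return summary


# Hypothetical sample data standing in for a DataFrame.
rows = [
    {"id": 1, "city": "Oslo", "score": 7.5},
    {"id": 2, "city": None, "score": 7.5},
    {"id": 3, "city": "Oslo", "score": 9.0},
]

for col in summarize_columns(rows):
    print(col)
```

In the notebook itself none of this is needed: the single `display` call with the summary flag produces the per-column view directly.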