Designing and Implementing Enterprise-Scale Analytics Using Microsoft Azure and Power BI — Question 4

You use an Apache Spark notebook in Azure Synapse Analytics to filter and transform data.
You need to review statistics for a DataFrame that includes:

The column name
The column type
The number of distinct values
Whether the column has missing values

Which function should you use?

Answer options

Correct answer: B

Explanation

Calling 'display(df, summary=true)' renders a per-column statistics summary of the DataFrame: the column name, the column type, the number of distinct values, and the missing values for each column. The lowercase 'true' is valid as written in a Scala cell; in a PySpark cell the Boolean literal is 'True', so the call becomes 'display(df, summary=True)'. The other options do not produce this summary. For example, 'display(df)' renders the rows without the summary statistics, and 'displayHTML()' renders HTML content.
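
To make the four statistics concrete, here is a minimal plain-Python sketch that computes them for a small in-memory dataset. It is an illustration only, not how Synapse implements the summary view: the function name summarize_columns and the sample rows are hypothetical, and the sketch works on a list of dictionaries rather than a Spark DataFrame so that it runs outside a notebook.

```python
# Plain-Python illustration of the four per-column statistics named in the question.
# In a Synapse notebook you would not write this yourself; the built-in call from
# the explanation (assuming a PySpark cell, hence the capitalised True) is:
#     display(df, summary=True)   # `display` exists only in the notebook runtime


def summarize_columns(rows):
    """Return one summary dict per column for a list of row dicts (hypothetical helper)."""
    # Collect column names in first-seen order.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    summary = []
    for name in columns:
        values = [row.get(name) for row in rows]
        # Treat None (or an absent key) as a missing value.
        present = [v for v in values if v is not None]
        type_names = sorted({type(v).__name__ for v in present})
        summary.append({
            "column": name,
            "type": "/".join(type_names) or "unknown",
            "distinct": len(set(present)),
            "has_missing": len(present) < len(values),
        })
    return summary


# Hypothetical sample data; the second row has a missing city.
rows = [
    {"id": 1, "city": "Oslo", "price": 10.5},
    {"id": 2, "city": None, "price": 10.5},
    {"id": 3, "city": "Lima", "price": 7.0},
]

for column_summary in summarize_columns(rows):
    print(column_summary)
```

Each printed dictionary corresponds to one column and carries the same four fields the question lists. The notebook's summary view presents this kind of information per column without any custom code.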