Google Cloud Professional Machine Learning Engineer — Question 139
You work on the data science team at a manufacturing company. You are reviewing the company’s historical sales data, which has hundreds of millions of records. For your exploratory data analysis, you need to calculate descriptive statistics such as mean, median, and mode; conduct complex statistical tests for hypothesis testing; and plot variations of the features over time. You want to use as much of the sales data as possible in your analyses while minimizing computational resources. What should you do?
Answer options
- A. Visualize the time plots in Google Data Studio. Import the dataset into Vertex AI Workbench user-managed notebooks. Use this data to calculate the descriptive statistics and run the statistical analyses.
- B. Spin up a Vertex AI Workbench user-managed notebooks instance and import the dataset. Use this data to create statistical and visual analyses.
- C. Use BigQuery to calculate the descriptive statistics. Use Vertex AI Workbench user-managed notebooks to visualize the time plots and run the statistical analyses.
- D. Use BigQuery to calculate the descriptive statistics, and use Google Data Studio to visualize the time plots. Use Vertex AI Workbench user-managed notebooks to run the statistical analyses.
Correct answer: C
Explanation
The correct answer is C. BigQuery calculates the descriptive statistics over all of the hundreds of millions of records where the data is stored, so the full dataset never has to be loaded into notebook memory. Vertex AI Workbench user-managed notebooks then handle the work that needs a general-purpose environment: plotting the features over time and running the complex statistical tests. Options A and B import the entire dataset into the notebook, which either demands a very large instance or limits the analysis to a small sample. Option D also uses BigQuery for the statistics, but it adds Google Data Studio as a third tool for the time plots, which the notebook can already produce.
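As a rough illustration of the BigQuery step in option C, the sketch below builds a Standard SQL query that returns the mean plus approximate median and mode for one numeric column. The table name `my-project.sales.transactions` and the column `sale_amount` are hypothetical, and the use of the approximate aggregates `APPROX_QUANTILES` and `APPROX_TOP_COUNT` is an assumption made here to keep resource use low, not something the question prescribes. The `run_query` helper assumes the `google-cloud-bigquery` package (with pandas support) and valid credentials are available in the notebook.

```python
def build_stats_query(table: str, column: str) -> str:
    """Return a BigQuery Standard SQL query for mean, median, and mode."""
    return f"""
    SELECT
      AVG({column}) AS mean_value,
      -- Two quantile buckets give [min, median, max]; OFFSET(1) is the median.
      APPROX_QUANTILES({column}, 2)[OFFSET(1)] AS approx_median,
      -- The single most frequent value serves as an approximate mode.
      APPROX_TOP_COUNT({column}, 1)[OFFSET(0)].value AS approx_mode
    FROM `{table}`
    """.strip()


def run_query(sql: str):
    """Run the query from a notebook and return a pandas DataFrame.

    Assumes google-cloud-bigquery is installed and credentials are configured.
    """
    # Imported lazily so the query builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    return client.query(sql).to_dataframe()


# Hypothetical table and column names, for illustration only.
sql = build_stats_query("my-project.sales.transactions", "sale_amount")
print(sql)
```

Only the one-row result of this query reaches the notebook, so the plotting and hypothesis tests that follow can run on a modest instance.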