You are developing a training pipeline for a new XGBoost classification model based on ta…

Question

You are developing a training pipeline for a new XGBoost classification model based on tabular data. The data is stored in a BigQuery table. You need to complete the following steps:

1. Randomly split the data into training and evaluation datasets in a 65/35 ratio.
2. Conduct feature engineering.
3. Obtain metrics for the evaluation dataset.
4. Compare models trained in different pipeline executions.

How should you execute these steps?

Accepted Answer

Correct answer: A

A.
1. Using Vertex AI Pipelines, add a component to divide the data into training and evaluation sets, and add another component for feature engineering.
2. Enable autologging of metrics in the training component.
3. Compare pipeline runs in Vertex AI Experiments.

Option A is correct because it covers all four required steps. Dedicated Vertex AI Pipelines components handle the data split and the feature engineering, autologging in the training component records the metrics for each run, and Vertex AI Experiments lets you compare runs from different pipeline executions side by side. Options B and C, while viable, do not address the comparison of runs as effectively as A. Option D lacks the components needed for pipeline execution in the scenario described.
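As a rough illustration of the random 65/35 split, the logic inside a data-splitting component could look like the sketch below. It is plain Python using only the standard library, not the Vertex AI Pipelines API. The function name `split_rows`, the fixed seed, and the in-memory list standing in for rows read from BigQuery are all assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_rows(
    rows: Sequence[T], train_fraction: float = 0.65, seed: int = 42
) -> Tuple[List[T], List[T]]:
    """Shuffle rows with a fixed seed and cut them into train/eval lists.

    Hypothetical helper: in a real pipeline the rows would come from the
    BigQuery table and the outputs would be written as component artifacts.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so every run splits the same way
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Stand-in data: 100 integer "rows" instead of real table records.
rows = list(range(100))
train_rows, eval_rows = split_rows(rows)
print(len(train_rows), len(eval_rows))  # prints: 65 35
```

Fixing the seed keeps the split identical across pipeline executions, so differences between runs compared later reflect changes to the model or features rather than a different partition of the data.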

Google Cloud Professional Machine Learning Engineer — Question 334
