Google Cloud Professional Machine Learning Engineer — Question 175

You work for a retail company. You have a managed tabular dataset in Vertex AI that contains sales data from three different stores. The dataset includes several features, such as store name and sale timestamp. You want to use the data to train a model that makes sales predictions for a new store that will open soon. You need to split the data between the training, validation, and test sets. What approach should you use to split the data?

Answer options

Correct answer: C

Explanation

The correct answer is C. A chronological split orders the rows by the sale timestamp, so the model is trained on the oldest data, validated on more recent data, and tested on the most recent data. This mirrors how the model will be used: predicting future sales from past sales. Options A and D ignore the temporal ordering of the rows, which can let information from later sales leak into training and make the evaluation look better than the model will perform in practice. Option B is not tailored to this dataset.
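
To make the idea concrete, here is a minimal sketch of a chronological split in plain Python. It is not Vertex AI's implementation. The function name chronological_split, the field names store_name and sale_timestamp, the sample rows, and the 80/10/10 fractions are all assumptions chosen for illustration.

```python
from datetime import datetime


def chronological_split(rows, time_key, train_frac=0.8, val_frac=0.1):
    """Sort rows by time_key and cut them into train/validation/test, oldest first.

    Hypothetical helper for illustration; the fractions are assumed defaults.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a non-empty test set")
    ordered = sorted(rows, key=lambda row: row[time_key])
    n = len(ordered)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    # Oldest rows train the model; the newest rows are held out for testing.
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]


# Made-up sample: ten sales from three stores, deliberately out of order.
days = [5, 1, 9, 3, 7, 2, 10, 4, 8, 6]
stores = ["store_a", "store_b", "store_c"]
sales = [
    {"store_name": stores[i % 3], "sale_timestamp": datetime(2024, 1, day)}
    for i, day in enumerate(days)
]

train, val, test = chronological_split(sales, "sale_timestamp")
print(len(train), len(val), len(test))
```

Every row in the training set is older than every row in the validation set, which in turn is older than every row in the test set, so the evaluation never relies on sales that happened after the data the model was trained on.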