Question

You recently deployed a model to a Vertex AI endpoint and set up online serving in Vertex AI Feature Store. You have configured a daily batch ingestion job to update your featurestore. During the batch ingestion jobs, you discover that CPU utilization is high in your featurestore’s online serving nodes and that feature retrieval latency is high. You need to improve online serving performance during the daily batch ingestion. What should you do?

Accepted Answer

Correct answer: A. Schedule an increase in the number of online serving nodes in your featurestore prior to the batch ingestion jobs. Adding online serving nodes before the daily ingestion starts (option A) gives the online store enough capacity to absorb the ingestion writes while still serving feature reads, which lowers CPU utilization per node and brings retrieval latency back down. Autoscaling the online serving nodes (option B) could help, but it reacts to load only after the load has risen, so it may not respond quickly enough for a predictable daily spike. Autoscaling the prediction nodes of the DeployedModel (option C) scales the model endpoint, not the featurestore, so it does not address the online serving nodes' performance. Increasing the worker_count in the batch ingestion job (option D) speeds up ingestion but does not relieve online serving latency; more workers writing to the online store can add to the CPU load on the serving nodes.
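The scheduled scale-up in option A could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the node counts, the 02:00 start time, the one-hour duration, and the 30-minute lead time are all hypothetical values, and the helper names target_node_count and apply_node_count are invented for this example. The sketch assumes the google-cloud-aiplatform Python SDK and its Featurestore.update_online_store method; check the call against the SDK version you have installed.

```python
from datetime import datetime, time, timedelta

# Hypothetical values for illustration; tune them to your own workload.
BASELINE_NODES = 2                        # node count outside the ingestion window
INGESTION_NODES = 6                       # node count while ingestion is running
INGESTION_START = time(2, 0)              # assumed daily ingestion start (02:00 UTC)
INGESTION_DURATION = timedelta(hours=1)   # assumed upper bound on job duration
LEAD_TIME = timedelta(minutes=30)         # scale up this long before ingestion


def target_node_count(now: datetime) -> int:
    """Return the node count the online store should have at `now` (naive UTC)."""
    # Check yesterday, today, and tomorrow so a window that crosses midnight
    # is still detected.
    for day_offset in (-1, 0, 1):
        start = datetime.combine(now.date() + timedelta(days=day_offset), INGESTION_START)
        if start - LEAD_TIME <= now < start + INGESTION_DURATION:
            return INGESTION_NODES
    return BASELINE_NODES


def apply_node_count(project: str, location: str, featurestore_id: str, node_count: int) -> None:
    """Set a fixed number of online serving nodes on the featurestore."""
    # Imported lazily so the scheduling logic above can run without the SDK.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    featurestore = aiplatform.Featurestore(featurestore_name=featurestore_id)
    # Assumption: update_online_store accepts fixed_node_count as in current SDK releases.
    featurestore.update_online_store(fixed_node_count=node_count)
```

A scheduler of your choice (for example a cron job) could call apply_node_count with the result of target_node_count at regular intervals, so the featurestore is already scaled up when ingestion begins and scales back down afterward.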

Google Cloud Professional Machine Learning Engineer — Question 251
