Google Cloud Professional Machine Learning Engineer — Question 326

You have developed a custom ML model using Vertex AI and want to deploy it for online serving. You need to optimize the model's serving performance by ensuring that the model can handle high throughput while minimizing latency. You want to use the simplest solution. What should you do?

Answer options

Correct answer: A

Explanation

The correct answer is A. Deploying the model to a Vertex AI endpoint gives you managed online serving that automatically scales the number of serving replicas with demand, so throughput stays high and latency stays low without extra infrastructure work. Option B is viable, but it adds containerization overhead that the simplest solution does not need. Options C and D address model improvement and analysis, not the immediate serving performance requirement.
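
As an illustration of the autoscaling described above, here is a minimal sketch using the Vertex AI Python SDK (google-cloud-aiplatform). It is not part of the original question. The helper names build_deploy_kwargs and deploy_with_autoscaling, the machine type, the replica counts, and the CPU target are all assumptions chosen for the example. The actual deployment call needs the SDK installed and valid Google Cloud credentials.

```python
from typing import Any, Dict


def build_deploy_kwargs(
    machine_type: str = "n1-standard-4",  # assumed machine type, adjust to the model
    min_replicas: int = 1,                # assumed floor for serving replicas
    max_replicas: int = 5,                # assumed ceiling for serving replicas
    target_cpu_percent: int = 60,         # assumed CPU utilization target
) -> Dict[str, Any]:
    """Collect autoscaling settings to pass to Model.deploy (illustrative values)."""
    if not 1 <= min_replicas <= max_replicas:
        raise ValueError("need 1 <= min_replicas <= max_replicas")
    if not 0 < target_cpu_percent <= 100:
        raise ValueError("target_cpu_percent must be in (0, 100]")
    return {
        "machine_type": machine_type,
        "min_replica_count": min_replicas,
        "max_replica_count": max_replicas,
        "autoscaling_target_cpu_utilization": target_cpu_percent,
    }


def deploy_with_autoscaling(project: str, location: str, model_id: str, **overrides: Any):
    """Hypothetical helper: deploy an uploaded model to a new endpoint with autoscaling."""
    # Imported lazily so the settings helper above works without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    model = aiplatform.Model(model_name=model_id)
    # With no endpoint argument, deploy() creates a new endpoint and returns it.
    return model.deploy(**build_deploy_kwargs(**overrides))
```

Setting max_replica_count above min_replica_count is what lets the endpoint add replicas under load. The right values depend on the model's size and the expected traffic.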