Google Cloud Professional Machine Learning Engineer — Question 69

You have deployed a model on Vertex AI for real-time inference. During an online prediction request, you get an “Out of Memory” error. What should you do?

Answer options

Correct answer: B

Explanation

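The correct answer is B. An online prediction request carries a batch of instances that the model server has to process in memory, so sending the request again with a smaller batch reduces the memory needed per request and avoids the “Out of Memory” error. Option A is incorrect because switching to batch mode gives up the real-time inference the model was deployed for and does not address the oversized request itself. Option C is incorrect because it changes only how the data is encoded, not how much memory the model needs to process it. Option D is incorrect because the error comes from memory usage within a single request, not from the number of requests being sent.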
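
As a rough illustration of answer B, the sketch below splits a large list of instances into smaller chunks and sends each chunk as its own request. The helper names `chunked` and `predict_in_chunks`, the `predict_fn` parameter, and the chunk size are hypothetical and chosen only for this example; `predict_fn` stands in for whatever call actually reaches the deployed endpoint.

```python
from typing import Any, Callable, Iterator, List, Sequence


def chunked(instances: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of `instances`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(instances), size):
        yield instances[start:start + size]


def predict_in_chunks(
    predict_fn: Callable[[Sequence[Any]], List[Any]],
    instances: Sequence[Any],
    size: int,
) -> List[Any]:
    """Send `instances` in smaller requests and collect results in order.

    `predict_fn` is a hypothetical stand-in: it takes one chunk of
    instances and returns one prediction per instance.
    """
    results: List[Any] = []
    for chunk in chunked(instances, size):
        # Each call handles only `size` instances, so the server never
        # has to hold the full list in memory for a single request.
        results.extend(predict_fn(chunk))
    return results
```

With the Vertex AI SDK for Python, `predict_fn` could plausibly be something like `lambda chunk: endpoint.predict(instances=list(chunk)).predictions`, assuming `endpoint` is an existing `aiplatform.Endpoint`. A chunk size that fits in memory depends on the model and the machine type, so it would need to be found by experiment.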