Google Cloud Professional Machine Learning Engineer — Question 316
You work at an organization that manages a popular payment app. You built a fraudulent transaction detection model by using scikit-learn and deployed it to a Vertex AI endpoint. The endpoint is currently using 1 e2-standard-2 machine with 2 vCPUs and 8 GB of memory. You discover that traffic on the gateway fluctuates and peaks at up to four times the endpoint's capacity. You need to address this issue by using the most cost-effective approach. What should you do?
Answer options
- A. Re-deploy the model with a TPU accelerator.
- B. Change the machine type to e2-highcpu-32 with 32 vCPUs and 32 GB of memory.
- C. Set up a monitoring job and an alert for CPU usage. If you receive an alert, scale the vCPUs as needed.
- D. Increase the number of maximum replicas to 6 nodes, each with 1 e2-standard-2 machine.
Correct answer: D
Explanation
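The correct answer is D. A Vertex AI endpoint autoscales between its configured minimum and maximum replica counts, so raising the maximum to 6 lets the endpoint add e2-standard-2 replicas when traffic spikes to four times the current capacity and remove them when traffic drops. The extra replicas are billed only while they are running, and six replicas cover the fourfold peak with headroom. Option A does not help because scikit-learn inference runs on CPU and cannot use a TPU. Option B would absorb the peak, but a single e2-highcpu-32 machine is billed at its full size all the time, including the periods when one e2-standard-2 is enough. Option C is a manual, reactive process: by the time someone responds to the alert, requests may already be delayed or dropped, and it adds operational work that autoscaling already handles.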
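As a rough illustration of option D, the sketch below checks that 6 replicas cover a fourfold peak and shows how the replica bounds might be passed to the Vertex AI Python SDK. It assumes that one e2-standard-2 replica handles exactly the baseline load, that the `google-cloud-aiplatform` package and default credentials are available, and that the caller supplies a full model resource name. The helper names (`replicas_needed`, `autoscaling_deploy_kwargs`, `deploy_with_autoscaling`) are hypothetical and not part of the question.

```python
import math

MACHINE_TYPE = "e2-standard-2"  # machine type from the question
PEAK_MULTIPLIER = 4.0           # peak traffic relative to one replica's capacity
MAX_REPLICAS = 6                # upper bound proposed in option D


def replicas_needed(peak_multiplier, per_replica_capacity=1.0):
    """Replicas required at peak, assuming load spreads evenly (an assumption)."""
    return math.ceil(peak_multiplier / per_replica_capacity)


def autoscaling_deploy_kwargs(min_replicas=1, max_replicas=MAX_REPLICAS):
    """Keyword arguments for Model.deploy that set the autoscaling bounds."""
    return {
        "machine_type": MACHINE_TYPE,
        "min_replica_count": min_replicas,
        "max_replica_count": max_replicas,
    }


def deploy_with_autoscaling(model_resource_name):
    """Deploy an uploaded model with the bounds above.

    model_resource_name is a placeholder supplied by the caller, for example
    a full "projects/.../locations/.../models/..." resource name.
    """
    # Imported here so the arithmetic above runs without the SDK installed.
    from google.cloud import aiplatform

    model = aiplatform.Model(model_resource_name)
    return model.deploy(**autoscaling_deploy_kwargs())


if __name__ == "__main__":
    needed = replicas_needed(PEAK_MULTIPLIER)
    print(f"Replicas needed at peak: {needed} (maximum allowed: {MAX_REPLICAS})")
```

Under these assumptions the peak needs 4 replicas, so a maximum of 6 leaves room for the autoscaler to keep per-replica utilization below 100 percent. Keeping the minimum at 1 means the baseline cost stays the same as the current deployment.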