Google Cloud Professional Machine Learning Engineer — Question 178
You work for an online grocery store. You recently developed a custom ML model that recommends a recipe when a user arrives at the website. You chose the machine type on the Vertex AI endpoint to optimize costs by using the queries per second (QPS) that the model can serve, and you deployed it on a single machine with 8 vCPUs and no accelerators.
A holiday season is approaching and you anticipate four times more traffic during this time than the typical daily traffic. You need to ensure that the model can scale efficiently to the increased demand. What should you do?
Answer options
- A. 1. Maintain the same machine type on the endpoint. 2. Set up a monitoring job and an alert for CPU usage. 3. If you receive an alert, add a compute node to the endpoint.
- B. 1. Change the machine type on the endpoint to have 32 vCPUs. 2. Set up a monitoring job and an alert for CPU usage. 3. If you receive an alert, scale the vCPUs further as needed.
- C. 1. Maintain the same machine type on the endpoint. Configure the endpoint to enable autoscaling based on vCPU usage. 2. Set up a monitoring job and an alert for CPU usage. 3. If you receive an alert, investigate the cause.
- D. 1. Change the machine type on the endpoint to have a GPU. Configure the endpoint to enable autoscaling based on the GPU usage. 2. Set up a monitoring job and an alert for GPU usage. 3. If you receive an alert, investigate the cause.
Correct answer: C
Explanation
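Option C is correct. The machine type was already chosen to serve the model cost-effectively at a known QPS, so there is no reason to change it. Enabling autoscaling based on vCPU usage lets the endpoint add replicas of that same machine as traffic rises and remove them when traffic falls, so you pay for extra capacity only while it is needed. The CPU alert then acts as a safety net for investigating unexpected behavior, not as the scaling mechanism. Option A depends on someone reacting to an alert and adding a node by hand, which is too slow for a traffic spike. Option B scales vertically to 32 vCPUs, which pays for peak capacity all the time and still relies on manual intervention. Option D adds a GPU that nothing in the scenario calls for, which raises cost without a stated benefit.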
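The approach in option C can be sketched in Python. This is a minimal sketch under stated assumptions, not a definitive deployment script: it assumes the single 8-vCPU machine is sized for typical traffic, that capacity scales roughly linearly with replica count, and that `n1-standard-8` is the 8-vCPU machine type in use (the question does not name it). The helper names and the resource names passed to `deploy_with_autoscaling` are hypothetical. The `min_replica_count`, `max_replica_count`, and `autoscaling_target_cpu_utilization` arguments are parameters of `Model.deploy` in the `google-cloud-aiplatform` SDK; the 60% target is an illustrative value.

```python
import math


def replicas_needed(baseline_replicas: int, traffic_multiplier: float,
                    headroom: float = 1.0) -> int:
    """Estimate the replica ceiling for a traffic multiple.

    Assumes throughput grows linearly with the number of replicas.
    """
    return math.ceil(baseline_replicas * traffic_multiplier * headroom)


def autoscaling_deploy_kwargs(machine_type: str, baseline_replicas: int,
                              traffic_multiplier: float,
                              target_cpu_utilization: int = 60) -> dict:
    """Build keyword arguments for Model.deploy with CPU-based autoscaling."""
    return {
        "machine_type": machine_type,            # unchanged machine type
        "min_replica_count": baseline_replicas,  # floor for normal traffic
        "max_replica_count": replicas_needed(baseline_replicas,
                                             traffic_multiplier),
        # Replicas are added when average CPU usage exceeds this percentage.
        "autoscaling_target_cpu_utilization": target_cpu_utilization,
    }


def deploy_with_autoscaling(model_name: str, endpoint_name: str,
                            **deploy_kwargs) -> None:
    """Deploy to an existing endpoint (needs google-cloud-aiplatform and credentials)."""
    from google.cloud import aiplatform  # imported lazily; optional dependency

    model = aiplatform.Model(model_name)           # hypothetical resource name
    endpoint = aiplatform.Endpoint(endpoint_name)  # hypothetical resource name
    model.deploy(endpoint=endpoint, **deploy_kwargs)


# One 8-vCPU replica today, four times the traffic expected.
kwargs = autoscaling_deploy_kwargs("n1-standard-8", baseline_replicas=1,
                                   traffic_multiplier=4.0)
print(kwargs)
```

Keeping `min_replica_count` at the current single replica preserves the cost profile for normal traffic, while `max_replica_count` caps spending during the holiday peak. In practice the ceiling would be validated with a load test instead of relying on the linear estimate.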