Google Cloud Professional Cloud DevOps Engineer — Question 65
You support a popular mobile game application deployed on Google Kubernetes Engine (GKE) across several Google Cloud regions. Each region has multiple Kubernetes clusters. You receive a report that none of the users in a specific region can connect to the application. You want to resolve the incident while following Site Reliability Engineering practices. What should you do first?
Answer options
- A. Reroute the user traffic from the affected region to other regions that don't report issues.
- B. Use Stackdriver Monitoring to check for a spike in CPU or memory usage for the affected region.
- C. Add an extra node pool that consists of high memory and high CPU machine type instances to the cluster.
- D. Use Stackdriver Logging to filter on the clusters in the affected region, and inspect error messages in the logs.
Correct answer: A
Explanation
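The correct answer is A. In Site Reliability Engineering incident response, the first priority is to mitigate user impact; root-cause analysis follows once service is restored. Rerouting traffic from the affected region to regions that report no issues restores access for users immediately while you investigate. B and D are diagnostic steps: checking for CPU or memory spikes in Stackdriver Monitoring, or inspecting error messages in Stackdriver Logging, helps locate the cause but does not restore connectivity, so both belong after mitigation. C assumes the cause is insufficient capacity before any diagnosis has been done. Even if more capacity turns out to be needed, adding a node pool is a longer-term change, not an immediate fix.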
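To make the mitigation in option A concrete, here is a minimal sketch of what "rerouting traffic away from the affected region" means in terms of traffic weights. It is a toy model, not a Google Cloud API: the function name `drain_region`, the weight dictionary, and the region names in the usage example are all assumptions chosen for illustration. In practice the shift would be made through whatever load-balancing or traffic-management layer fronts the clusters.

```python
def drain_region(weights: dict[str, float], affected: str) -> dict[str, float]:
    """Return new traffic weights with `affected` set to 0.

    Hypothetical helper for illustration only. The affected region's share
    is redistributed across the remaining regions in proportion to their
    current weights, so the total amount of traffic stays the same.
    """
    if affected not in weights:
        raise KeyError(f"unknown region: {affected}")

    # Combined weight of every region except the one being drained.
    healthy_total = sum(w for region, w in weights.items() if region != affected)
    if healthy_total <= 0:
        raise ValueError("no healthy region available to absorb traffic")

    total = sum(weights.values())
    return {
        region: 0.0 if region == affected else weight * total / healthy_total
        for region, weight in weights.items()
    }


# Usage: three illustrative regions, with asia-east1 reported as unreachable.
current = {"us-east1": 50, "europe-west1": 30, "asia-east1": 20}
mitigated = drain_region(current, "asia-east1")
print(mitigated)
```

The sketch deliberately does nothing diagnostic. That mirrors the ordering in the explanation: shift traffic first, then use monitoring and logs to find the cause while users are already being served from other regions.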