Google Cloud Professional Cloud DevOps Engineer — Question 79
Your company follows Site Reliability Engineering principles. You are writing a postmortem for an incident that was triggered by a software change and severely affected users. You want to prevent severe incidents like this from happening in the future. What should you do?
Answer options
- A. Identify the engineers responsible for the incident and escalate to senior management.
- B. Ensure that test cases that catch errors of this type are run successfully before new software releases.
- C. Follow up with the employees who reviewed the changes and prescribe practices they should follow in the future.
- D. Design a policy that will require on-call teams to immediately call engineers and management to discuss a plan of action if an incident occurs.
Correct answer: B
Explanation
B is correct because it addresses the cause of the incident: a software change reached production with a defect that the release process did not catch. Adding test cases that detect this class of error, and requiring them to pass before each new release, reduces the likelihood that a similar change causes another severe incident. Options A and C focus on individuals rather than on the process. That conflicts with the blameless postmortem culture that SRE promotes, and it does not close the gap that let the defect through. Option D concerns communication during an incident. It may affect how the response is coordinated, but it does not reduce the likelihood of incidents caused by software changes.
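To make option B concrete, here is a minimal sketch of a regression test used as a release gate. Everything in it is an assumption for illustration: `parse_retry_limit` is a hypothetical stand-in for the code that was changed, the negative-value defect is an invented example of "errors of this type", and `release_gate` is a hypothetical helper, not part of any Google Cloud tool.

```python
import unittest


def parse_retry_limit(raw: str) -> int:
    """Hypothetical function standing in for the code changed before the incident."""
    value = int(raw)
    if value < 0:
        # Assumed defect class: negative limits were previously accepted.
        raise ValueError("retry limit must be non-negative")
    return value


class RetryLimitRegressionTest(unittest.TestCase):
    """Hypothetical tests written as a postmortem action item."""

    def test_accepts_valid_value(self):
        self.assertEqual(parse_retry_limit("3"), 3)

    def test_rejects_negative_value(self):
        # Reproduces the assumed failure mode so it cannot silently return.
        with self.assertRaises(ValueError):
            parse_retry_limit("-1")


def release_gate() -> bool:
    """Return True only if every regression test passes."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        RetryLimitRegressionTest
    )
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

A CI/CD pipeline step could call `release_gate()` and stop the rollout when it returns `False`, so the release proceeds only after these tests run successfully.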