APICS Certified Supply Chain Professional (CSCP) — Question 43
You encounter a large number of outages in the production systems you support. Every outage triggers an alert, and the alerts wake you up at night. The alerts are caused by unhealthy systems that are automatically restarted within a minute. You want to set up a process that prevents staff burnout while following Site Reliability Engineering practices. What should you do?
Answer options
- A. Eliminate unactionable alerts.
- B. Create an incident report for each of the alerts.
- C. Distribute the alerts to engineers in different time zones.
- D. Redefine the related Service Level Objective so that the error budget is not exhausted.
Correct answer: A
Explanation
The correct answer is A. The unhealthy systems are restarted automatically within a minute, so the alerts require no human action. Eliminating these unactionable alerts removes the nightly disruptions and lets the team focus on issues that need intervention, which helps prevent burnout. Option B adds work without reducing the number of alerts. Option C spreads the alerts across time zones but does not reduce their number. Option D changes the reliability target instead of fixing the alerting, so it does not address alert fatigue.
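To make "unactionable" concrete, here is a minimal Python sketch of a paging rule. It is an illustration only: the `Outage` type, the `should_page` and `route` functions, and the 60-second threshold are hypothetical names and values chosen to mirror the scenario (systems that restart within a minute). They do not come from any specific monitoring tool.

```python
from dataclasses import dataclass

# Assumption: mirrors the scenario, where unhealthy systems restart within a minute.
AUTO_RESTART_WINDOW_SECONDS = 60


@dataclass
class Outage:
    """Hypothetical record of a single outage event."""
    service: str
    duration_seconds: float
    auto_recovered: bool


def should_page(outage: Outage) -> bool:
    """Return True only when a human has something to do."""
    # An outage that healed itself inside the restart window needs no action.
    if outage.auto_recovered and outage.duration_seconds <= AUTO_RESTART_WINDOW_SECONDS:
        return False
    return True


def route(outage: Outage) -> str:
    """Send actionable outages to the pager; record the rest for later review."""
    return "page" if should_page(outage) else "log"
```

For example, `route(Outage("checkout", 45, True))` is logged rather than paged, while an outage that does not recover on its own still pages the on-call engineer. Logging the suppressed events keeps them available for daytime review of why the systems become unhealthy in the first place.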