Google Cloud Professional Cloud DevOps Engineer — Question 74
You are responsible for the reliability of a high-volume enterprise application. A large number of users report that an important subset of the application's functionality (a data-intensive reporting feature) is consistently failing with an HTTP 500 error. When you investigate your application's dashboards, you notice a strong correlation between the failures and a metric that represents the size of an internal queue used for generating reports. You trace the failures to a reporting backend that is experiencing high I/O wait times. You quickly fix the issue by resizing the backend's persistent disk (PD). Now you need to create an availability Service Level Indicator (SLI) for the report generation feature. How would you define it?
Answer options
- A. As the I/O wait times aggregated across all report generation backends
- B. As the proportion of report generation requests that result in a successful response
- C. As the application's report generation queue size compared to a known-good threshold
- D. As the reporting backend PD throughput capacity compared to a known-good threshold
Correct answer: B
Explanation
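The correct answer is B. An availability SLI should measure what users actually experience: the proportion of report generation requests that return a successful response. Options A, C, and D track internal signals (I/O wait time, queue size, and PD throughput capacity) that may point to the cause of a failure but do not directly measure whether requests succeed. A queue can grow or a disk can slow down without users seeing errors, and requests can fail for reasons none of those metrics capture.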
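The definition in option B can be illustrated with a minimal sketch that computes the SLI as good requests divided by total requests. The function name `availability_sli`, the sample status codes, and the rule that any non-5xx response counts as successful are assumptions made for illustration; they are not part of the exam question or of any Google Cloud API.

```python
from typing import Iterable, Optional


def availability_sli(status_codes: Iterable[int]) -> Optional[float]:
    """Return the fraction of requests that succeeded, or None if there were none.

    Assumption: a response counts as successful when its HTTP status is below 500.
    """
    total = 0
    good = 0
    for code in status_codes:
        total += 1
        if code < 500:
            good += 1
    if total == 0:
        # No traffic in the window, so the ratio is undefined.
        return None
    return good / total


# Hypothetical status codes for report generation requests in one window.
sample = [200, 200, 500, 200, 503, 200, 200, 200, 200, 200]
sli = availability_sli(sample)
print(f"availability SLI: {sli:.1%}")
```

In this sketch the indicator depends only on request outcomes, so it would reflect the HTTP 500 failures from the scenario regardless of whether the cause was disk I/O, queue growth, or something else.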