Monitorama PDX 2024 - Serverless Observability: a case study of SLOs

Conference: Monitorama PDX 2024

Year: 2024

Virginia Diana Todea's session from Monitorama PDX 2024. Fine tuning the Service Level Objectives is a golden rule for the industries adopting them. This talk explores the case study of migrating to Kubernetes and how SLOs and transforms can backfire if not managed correctly. The talk starts by presenting the reasons why SLOs are important in the software development framework. We will then present our case study on how migrating to Kubernetes exposed our system to SLO definition and the challenges our system faced when trying to measure and adhere to our SLOs. We will deep dive in the main bottlenecks we encountered when trying to apply our SLOs across our multi cluster infrastructure, how we dealt with these challenges by correlating our SLOs with burn rates and transforms. We highlight that the observability system can backfire as a consequence if the metrics are not properly adjusted. We wrap up our talk by drawing the main lessons from our case study and the benefits to the ecosystem: SLOs are paramount for the software development lifecycle and implementing them correctly should follow a process of continuous improvement alongside with adopting the right tools and metrics.