Use Chaos Engineering to improve incident response | Eran Levy | Conf42 Incident Management 2022

Conference: Conf42 Incident Management 2022

Year: 2022

As engineers, we used to write code that interacted with a well-defined set of other applications. You usually had a set of services running in well-defined environments. The evolution of cloud native technologies and the need to move fast led organizations to redesign their structure. Engineers are now required to write services that are just one of many services that together solve a certain customer problem. Your services are smaller than they used to be, they don’t live in a vacuum, and you have to understand the problem space your service lives in.

These days engineers aren’t just writing code. They are expected to know how to work with Kubernetes and Helm, containerize their services, ship to different environments, and debug in a distributed cloud environment. To strengthen engineers’ cloud native knowledge and their best practices for handling production incidents, we started a series of workshops called “On-Call like a king”. Every workshop is a set of chaos engineering experiments that simulate real production incidents, and the engineers practice investigating them, resolving them, and finding the root cause. In this talk I will share how we got there, what we are doing, and how it improves our engineering teams’ expertise.

Other talks at this conference: https://www.conf42.com/im2022