Chaos Engineering in Action: Building Fault-Tolerant Systems | Hareesh Iyer | Conf42 SRE 2024

Read the abstract ➤ https://www.conf42.com/Site_Reliability_Engineering_SRE_2024_Hareesh_Iyer_chaos_engineering_practical
Other sessions at this event ➤ https://www.conf42.com/sre2024
Support our mission ➤ https://www.conf42.com/support
Join Discord ➤ https://discord.gg/DnyHgrC7jC

Chapters
0:00 intro
0:26 preamble
0:53 agenda
1:15 why chaos engineering?
3:19 update about the october 4 outage
5:00 business impact of resilience is bigger than ever
5:19 why are these issues not surfaced during testing?
5:42 testing
7:19 what is chaos engineering?
10:03 testing vs experiments
11:37 chaos engineering: how to
13:09 #1: observe steady state
14:02 #2: plan hypothesis around the steady state
14:54 #3: run experiments
17:15 #4: verify and act
17:51 chaos engineering tools
18:59 thank you!