The Path to self-healing systems | Leonid Belkind | Conf42 Chaos Engineering 2021
Leonid Belkind, CTO @ StackPulse

Chaos Engineering is a powerful tool for testing your resilience when the unexpected occurs. But you also need to build out the practice for what comes next: how you rectify issues caused by unexpected conditions. In this talk, Leonid Belkind, StackPulse CTO and Co-Founder, discusses how to use Chaos Engineering together with generic mitigations to deliver resilience that can't be accomplished with either alone. He'll share examples of how to fit these practices together, what's needed to get started, and how to use both together to begin building out self-healing systems.

🥇 Gold Sponsor: StackPulse
🥈 Silver Sponsors: Aval Digital Labs BoomerTechnologyGroup bxblue Certus Cybersecurity Chaos Native CrowdSec bol.com effx FireHydrant GoCardless Gremlin Locelle Nuaware PagerDuty Zühlke Group 7bulls.com
🤝 Media Partners: The New Stack Manning AWS Inside Dev

0:00 Intro
0:26 Preamble
0:50 Agenda
1:29 Chaos Engineering Manifesto
3:08 What does Chaos Engineering do for us?
4:27 TLDR; (Part I)
5:21 How much resilience do we want?
10:03 Proficiency in building Resilient systems
11:30 TLDR; (Part II)
12:09 Generic Mitigations - What are they
13:05 So wait... is this a bad Software Engineering?
13:51 Typical timeline of an outage
17:47 So, what kind of Generic Mitigations do you have in mind?
20:36 Is that it? Of course not
22:18 TLDR; (Part III)
23:04 1 + 1 = 3 (?)
26:17 Afterthought and Action Items
29:58 A platform / methodology to combine both?
30:12 Thank you!

Website 🚀🪐 https://www.conf42.com
Reach out 📧📭 mark@conf42.com
Conf42 Discord 🧑🤝🧑💬 https://discord.com/invite/dT6ZsFJ5ZM
LinkedIn 👨💼💼 https://www.linkedin.com/company/49110720/
Twitter 🎵🐦 https://twitter.com/conf42com
Conf42Cast @ Spotify 🎧 https://tinyurl.com/bnyj6a8y