Journey to Resilience | Vilas Veeraraghavan | Conf42 Chaos Engineering 2020
Vilas Veeraraghavan, Director of Engineering @ Walmart Labs

Chaos engineering has come a long way from its early days at Netflix. Its importance is no longer questioned in the community, but as it has gone mainstream, teams quickly learn that adoption is not a given. In this talk, we cover the challenges we encountered at Walmart and the techniques we used to break through them. We discuss our successes and failures on the journey to resilience, highlighting the major barriers to adoption. The talk also covers the strategies we used to build tooling to guide teams, as well as a gamified approach to motivate them.

0:00 Preamble
0:30 Chaos Engineering
0:50 Walmart in numbers
1:46 Where do we start the journey?
3:52 Goal
4:19 Every Outage is a Chaos Exercise
5:11 Downtime is Expensive
7:10 The Homework
7:25 Observability
8:53 More On-Call Prerequisites
11:04 Generating production-like Load
12:29 CI/CD Workflow - invest in it
14:34 Build a Maturity Model
15:11 Support Costs
15:51 Build the Right Tools
16:34 Build the Right Mindset
18:02 What we learnt on our Journey
18:19 Eliminate Vanity Positions
19:35 Don't Assume. Verify.
21:30 Are Teams ready for Exercises?
22:24 Where are we now? - Report card
24:45 Thank you!

Gold Sponsors: ChaosIQ, PagerDuty

Website: https://www.conf42.com
Reach out: mark@conf42.com
Conf42 Discord: https://discord.com/invite/dT6ZsFJ5ZM
LinkedIn: https://www.linkedin.com/company/49110720/
Twitter: https://twitter.com/conf42com
Conf42Cast @ Spotify: https://tinyurl.com/bnyj6a8y