List of videos

Who is responsible for Chaos? | Joyce Lin | Conf42 Chaos Engineering 2020

Joyce Lin, Developer Advocate Lead @ Postman
If you’re thinking of starting a chaos program, you might be wondering which job functions are typically responsible for managing chaos within their organizations. This talk looks across a number of companies to determine who has historically initiated chaos programs and reveals new trends in this space.
0:00 Preamble 0:50 Who is responsible for chaos? 1:07 Which job titles are doing chaos? 1:23 By job title - diagram 1:53 Quote by Kolton Andrus (CEO @ Gremlin) 2:18 Responsibilities 4:14 Roles 4:35 Why aren't testers doing chaos? - Chaos Testing 5:00 Software Development Lifecycle 5:14 It was called chaos testing 5:45 Quote by Abby Bangser (PTE @ MOO) 6:29 Testers doing chaos?! 7:15 Who can start a chaos program? 8:22 Do I need to wait for a catastrophe 8:34 Quote by Casey Rosenthal (CEO @ Verica) 9:02 Final thoughts 11:08 Thank you! + Sources
🥇 Gold Sponsors: ChaosIQ, PagerDuty
Website 🚀🪐 https://www.conf42.com Reach out 📧📭 mark@conf42.com Conf42 Discord 🧑‍🤝‍🧑💬 https://discord.com/invite/dT6ZsFJ5ZM LinkedIn 👨‍💼💼 https://www.linkedin.com/company/49110720/ Twitter 🎵🐦 https://twitter.com/conf42com Conf42Cast @ Spotify 🎧 https://tinyurl.com/bnyj6a8y

Post Mortem Culture: Learning from Failure | Yury Niño Roa | Conf42 Chaos Engineering 2020

Yury Nino Roa, DevOps Engineer @ Aval Digital Labs
Practicing Chaos Engineering and reproducing outages have taught us that the culture of postmortems must be open and blameless. That is difficult, in part because of the social stigma associated with publicly acknowledging people's contributions to outages. And although the scenarios simulated in a gameday are entirely realistic, it is hard to write up postmortems that summarize all events, hint at human factors, recognize that there is no single root cause, and provide action items. At Aval Digital Labs, we are implementing a toolbox that automates the steps involved in chaos game days and generates postmortems using tools available in the market.
0:00 About me 1:17 Have you written a Postmortem? 1:53 Agenda 2:29 What is a Postmortem? 3:53 If Postmortem are good, why don't we do it? 5:28 How to change a blameful culture? 5:43 Chaos Engineering 6:01 Chaos GameDays 7:12 What does it mean in practice? 9:00 Gaveta 11:50 Gaveta uses a hexagonal architecture 19:00 Promoting postmortem culture 19:35 Thank you!

Shipping Quality Software in Hostile Environments | Luka Kladaric | Conf42 Chaos Engineering 2020

Luka Kladaric, Founder & Chaos Management @ Sekura Collective
Everyone loves features, right? Product loves features. Management loves features. The board loves features. Features are what make the users use and the investors invest, right? They certainly make the media pay attention. What happens when, for 8 years straight, all you care about is features? Productivity grinds to a halt, production outages are a given, post-mortems are a joke, and job satisfaction and happiness are flatlining. This talk shares lessons learned from unravelling layers and layers of terribleness to rediscover productivity and job satisfaction while also improving the security and robustness of the products.
0:00 Who am I? 0:45 Hostile Environments? 1:08 Tech Debt? 2:30 Where does it come from? 4:55 What's the Harm? 7:09 Case Study 12:34 How do you even begin to fix this? 17:18 The Moral of this Story 18:48 How do we do Better? 20:22 Time for a New Approach 21:20 New Name: Sustainability Work 21:54 Budget VS Planning 22:51 Helps with Morale 22:59 "Sustainability Work" over "Tech Debt"

Chaos Engineering for SQL Server | Andrew Pruski | Conf42 Chaos Engineering 2020

Andrew Pruski, SQL Server DBA @ Channel Advisor
Slides: https://580d9e60-4356-475f-aa71-085d1b84e2cf.filesusr.com/ugd/f3e158_da0c445802b346f08f85bef1310bd909.pdf
LinkedIn: https://www.linkedin.com/in/andrewpruski/
In this session we’ll look at how Chaos Engineering can be applied to SQL Server. SQL Server has various high availability solutions, but can we be sure that they’ll react as expected to a real-world issue? Has the HA architecture only ever been tested in a planned maintenance window? We’ll explore SQL Server’s built-in high availability features and take a look at Kubernetes, a brand new platform for SQL Server. We’ll also have some fun by looking at KubeInvaders, a chaos engineering tool for Kubernetes… using Space Invaders!
0:00 About me 0:45 Session Aim 1:47 Agenda 2:39 Identifying Weaknesses - Incident Analysis 4:04 Likelihood - Impact Map 12:10 Defining an Experiment 13:15 Running an Experiment 14:45 Demo 22:50 SQL Server running on Kubernetes 24:14 KubeInvaders 28:08 Resources

Getting out of the Starting Blocks | Adrian Hornsby | Conf42 Chaos Engineering 2020

Adrian Hornsby, Principal Technologist, Architecture @ Amazon Web Services (AWS)
LinkedIn: https://www.linkedin.com/in/hornsby/
Architectures are growing increasingly distributed and hard to understand. As a result, software systems have become extremely difficult to debug and test, which increases the risk of failure. With these new challenges, chaos engineering has become attractive to many organizations as a mechanism for understanding the behavior of systems under unexpected circumstances. While interest is growing, few have managed to build sustainable chaos engineering practices. In this talk, I will review the state of chaos engineering and the issues customers are facing, based on what I have learned as an AWS Solution Architect and Technologist focusing on Chaos Engineering, and explain why I started to build tools to help with failure injection.
0:00 Preamble 1:55 What prevents the wide adoption of chaos engineering? 2:54 Why is production chaos? 3:45 #0 - (Don't) call it (Chaos) Engineering 4:47 #1 - Look at the Bigger Picture 13:20 #2 - Change begins with understanding 24:30 #3 - Choose your Trojan Horse 28:16 #4 - Over-index on the Hypothesis 32:15 #5 - Introduce Chaos Engineering Early in the Journey 35:31 #6 - Blast-Radius Reduction Mindset 36:23 #7 - If you haven't verified it, it's probably Broken 39:43 https://github.com/adhorn 40:50 Getting out of the starting blocks 43:46 Thank you!

Cloud Native Chaos Engineering | Umasankar Mukkara | Conf42 Chaos Engineering 2020

Uma Mukkara, Co-Founder and COO @ MayaData
The cloud native approach has pleasantly surprised the DevOps world with the broad adoption of Kubernetes across all roles, from developers to SREs to VPs of digital transformation. As the huge mass of legacy applications moves to cloud-native platforms, an important problem arises: how do SREs make sure the systems do not have weaknesses and have the required level of resilience? A well-thought-out chaos engineering methodology is the right answer. Yet for a large number of fast-changing applications and infrastructure, finding the right set of chaos experiments and identifying whether the chaos has exposed a weakness in the system is an almost impossible task. In Cloud Native Chaos Engineering, developers write chaos tests as an extension of the development process. These tests are built using standard Kubernetes Custom Resources (CRs) so that they are easier to adapt to the environment. The chaos experiments are refined in CI pipelines and finally published in the Chaos Hub so that they are available to SREs running the cloud-native applications in production. SREs use these chaos experiments for various microservices to schedule chaos at random and find weaknesses in their deployments, which leads to increased reliability.
0:00 Preamble 1:20 Chaos Eng is... 3:14 Agenda 4:28 Cloud Native Chaos Engineering 5:48 Cloud Native environment(s) 11:45 Cloud Native Chaos Engineering 13:31 Principles 12:23 Litmus project 21:44 Cloud Native Chaos Engineering - Example 22:36 ChaosHub 26:55 How can you contribute

Psychology of Chaos Engineering | Matty Stratton | Conf42 Chaos Engineering 2020

Matty Stratton, DevOps Advocate @ PagerDuty
Chaos Engineering, failure injection, and similar practices have verified benefits for the resilience of systems and infrastructure. But can they provide similar resilience to teams and people? What are the effects and impacts on the humans involved in the systems? This talk will delve into both positive and negative outcomes for all the groups of people involved, including users, engineers, product, and business owners. Using case studies from organizations where chaos engineering has been implemented, we will explore the changes in attitude that these practices create. This talk will include a brief overview of chaos engineering practices for audience members unfamiliar with them, but the main focus will be on human elements. I will discuss successful implementations, as well as challenges faced in teams where chaos was a “success” from a technical perspective but had a negative impact on the people involved.
0:00 Preamble 2:24 let's set some agreement 3:08 Chaos Engineering - definition 3:40 Chaos @ Netflix 4:57 perceptions 5:29 "isn't all engineering chaotic?" 6:19 it's not about breaking things 6:43 look, I know you know this 7:30 I'm gonna say it anyway 7:44 these are experiments 8:59 how we talk about things matter 12:00 people get nervous 12:31 You want to do 'what' in production?? 13:30 use your monitoring like it's for real because it is 15:43 but what about people? 16:21 what do you feel knowing Netflix uses chaos engineering? 16:42 what about your bank? 16:58 blast radius - Twitter poll 16:24 data, such as it is 18:32 management can get... nervous - consider your words 19:34 it's all about philosophy 21:59 safety first 25:05 - https://speaking.mattstratton.com

Premiere - Conf42 Chaos Engineering 2021

🔵 Conf42 strikes back with Chaos Engineering 2021! Whole lineup: https://www.conf42.com/ce2021 Conf42 Discord 🧑‍🤝‍🧑💬 https://discord.com/invite/dT6ZsFJ5ZM
Enjoy this Premiere Show - an 🍽️ aperitif before digging into the 🍲 main dish - Keynote and Talks! We've got 31 videos on the menu!
Erratum: the quote "(...) a film should start with an earthquake (...)" is from Rupert Hart-Davis ("The Spectator", 1938), not Alfred Hitchcock.
0:00 Sponsored Segment 2:26 Preamble
🍔🦈 Keynote: 2:47 Mikolaj Pawlikowski
✨ Featured talk: 3:19 Leonid Belkind
🔒 Security Track: 3:53 Swapnil Deshmukh 4:35 Aaron Rinehart & David Lavezzo 5:11 Thibault Koechlin 5:37 Kennedy Torkura 6:04 Yury Nino Roa & Jhonnatan Gil Chaves 6:31 Romansh Yadav
🤿 Deep Dive Track: 6:57 Tammy Bryant Butow 7:12 Paweł Skrzypek & Alicja Reniewicz 7:38 Lisa Karlin Curtis 8:01 Uma Mukkara 8:19 Long Zhang 8:48 Ryan Guest 9:15 Quintessence Anx 9:40 Mahesh Venkataraman
📝 Lessons Learned Track: 9:56 Derris Boomer 10:13 Maik Figura & Oliver Kracht 10:38 Ana Margarita Medina 11:06 Bart Enkelaar 11:26 Joey Parsons 11:54 Reuben Rajan George 12:24 Quintin Balsdon 12:48 Piyush Verma
🎭 Culture: 13:12 Karolina Rachwał 13:42 Julie Gunderson 13:57 Humaira Ahmed 14:22 Robert Ross 14:51 Prandaj Deo 15:17 Amir Shaked 15:40 Fabricio Buzeto
16:00 Thank you! Conf42Cast announcement
🥇 Gold Sponsor: StackPulse
🥈 Silver Sponsors: Aval Digital Labs BoomerTechnologyGroup bxblue Certus Cybersecurity Chaos Native CrowdSec bol.com effx FireHydrant GoCardless Gremlin Locelle Nuaware PagerDuty Zühlke Group 7bulls.com
🤝 Media Partners: AWS Inside Dev Manning The New Stack

Chaos Engineering in 2021 | Mikolaj Pawlikowski | Conf42: Chaos Engineering

Mikolaj Pawlikowski, Engineering Lead @ Bloomberg, author of "Chaos Engineering: Site reliability through controlled disruption"
0:00 Intro 0:27 Talk 1:11 Today's talk 1:58 Reliability 2:23 Remarkably Reliable 2:58 Resilience 3:32 2020 - test of resiliency 4:11 Trends 8:13 Code/Connectivity/Infrastructure/People 12:28 What is Chaos Engineering? - Definition 13:00 Testing 13:55 Chaos Experiment - ideas 14:48 4 steps to Chaos Experiment 16:50 Chaos Experiment - Outcomes 17:25 Innovation Adoption Lifecycle 18:52 (Spoiler alert) Boring is good! 20:33 Tools 21:30 What's blocking you when doing Chaos? 21:55 Myths 29:00 Getting Chaos on the roadmap 30:25 Shark vs Burger 32:42 Book + discount code 35:10 Let's connect! 35:25 Thank you!
