The DiRT on Chaos Engineering at Google • Jason Cahoon • GOTO 2021
This presentation was recorded at GOTOpia February 2021. #GOTOcon #GOTOpia
Jason Cahoon - Site Reliability Engineer at Google
A shallow dive into 15 years of Chaos Engineering at Google, the lessons we've learned performing many thousands of disaster tests on production systems, and some tips on how to approach getting started with Chaos Engineering at your own [...]
01:02 DiRT: Disaster Resiliency Testing
04:38 What we test?
06:01 Testing themes
10:01 Practical vs theoretical
15:12 Picking what to test
16:29 Steps for bootstrapping a disaster testing program
18:25 Testing production vs testing in production
20:16 Really, you're breaking production though?!
23:00 Reporting on results
24:24 What have we learned?
26:55 Test example: Run at service level
28:51 Test example: Toggle the O-N / O-F-F discriminator
30:25 Test example: Run without dependencies
31:53 Test example: Hacked!
Download slides and read the full abstract here:
#ChaosEngineering #DiRT #Resilience #Observability #BuildingResilience #Resiliency #Programming #SRE #GameDay #DigitalTransformation #Reliability #RPC #RPCFaultInjection