The Endgame of SRE

The containers are deployed and the builds are green. YAML flows through the system, linted, reviewed, tested, and shipped with ease and regularity. Our intrepid SRE finds themself at a crossroads. The infrastructure is great, but teams still struggle to maintain error budgets. These developers need help, and it doesn't seem like anyone else is coming. We embark on a journey to find out where resilience exists and how to make more. Join Amy on an epic quest into sociotechnical thinking, exploring ways SREs can impact reliability at scale beyond the bits and bytes that got us this far.


Amy Tobey

Senior Principal Engineer and SRE Practice Leader @Equinix

Amy Tobey has worked in tech for more than 20 years at companies of every size, working with everything from kernel code to user interfaces. These days she is senior principal engineer leading Applied Resilience Engineering at Equinix. When she's not working, she can be found with her nose in a book, watching anime with her son, making noise with electronics, or doing yoga in the sun.



Tuesday Dec 6 / 11:20AM PST ( 50 minutes )


SRE Reliability Containers YAML DevOps


From the same track

Session SRE

Did the Chaos Test Pass?

Tuesday Dec 6 / 10:10AM PST

People used to ask me all the time how to figure out if their chaos test has “passed,” and I’d always say “well, that’s a loaded question.” To confirm that a chaos test “passed,” we need to do verification of hypotheses - sometimes you’re trying to prove some system behavior occurred in response

Christina Yakomin

Senior Site Reliability Engineering Specialist @Vanguard_Group

Session SRE

Rethinking Reliability: What You Can (and Can't) Learn From Incidents

Tuesday Dec 6 / 09:00AM PST

This talk presents research collected from the VOID—an open database of public incident reports. Containing over 2,000 reports for almost 700 organizations, the database allows for more structured review and research about software-related incident reporting.

Courtney Nash

Internet Incident Librarian & Senior Research Analyst @Verica

Session SRE

The Eternal Sunshine of the Toil-Less Prod

Tuesday Dec 6 / 12:30PM PST

One of the most important decisions in building an SRE practice is what kind of work should be assigned to the SRE team, and in what percentages.

Sasha Rosenbaum

Director of the Cloud Services Black Belt Team @RedHat