Tammy Bryant Butow

Principal Site Reliability Engineer @Gremlin

Tammy Butow is the principal SRE at Gremlin, where she works on Chaos Engineering, the facilitation of controlled experiments to identify systemic weaknesses. Gremlin helps engineers build resilient systems using their control plane and API. Tammy previously led SRE teams at Dropbox responsible for databases and storage systems used by over 500 million customers. Prior to this, Tammy worked at DigitalOcean and at one of Australia’s largest banks in security engineering, product engineering, and infrastructure engineering. Tammy is the co-founder of Girl Geek Academy, a movement to teach 1 million girls technical skills by 2025.

Find Tammy Bryant Butow at:

Interactive Session

Operational Excellence Panel

Being on call for a production system can be stressful whether it is your first time or you have been carrying a pager for years. When that alert goes off, will you be prepared? Will your system reliability mechanisms behave as intended? If not, are you able to debug and understand what’s going on? This roundtable pulls together software engineers and site reliability engineers with experience operating complex systems in production.

Topics are likely to include designing for operability, mitigation techniques, testing strategies, and lessons learned. As an audience member, you will also have the chance to ask the panel questions.

Date

Wednesday Nov 4 / 12:30PM PST (40 minutes)

Track

Architecting for Confidence: Building Resilient Systems

Session

Observing and Understanding Failures: SRE Apprentices

In this session, Tammy will share how Padawans and Jedis can inspire and teach us how to help people of a wide variety of backgrounds, ages, and experience levels to observe and understand failures in production. Tammy will share how she and a colleague created an SRE Apprentice program to hire and train new SREs who wanted a career change. Tammy will cover practical lessons learned and things she'd change, and she'll also share how you can create and roll out a program for SRE Apprentices within your organization. Tammy will also share feedback from the SRE Apprentices themselves. Is it difficult to observe and understand failures? Why is training from someone more experienced helpful? What are the hardest and easiest things to learn about observing and understanding failures as an SRE for 500 million+ users?

Date

Tuesday May 18 / 08:00AM PDT (40 minutes)

Track

Observability and Understandability in Production

Topics

Observability, DevOps, Incident Management

Build your learning journey and level-up on the skills most in-demand in 2021. Attend QCon Plus (Nov 1-5, 2021).
