The on-demand video of this panel is available only to logged-in QCon attendees. Please log in to your QCon account to watch the panel.

PANEL DISCUSSION + Live Q&A

Panel: Observability and Understandability

This panel features experienced practitioners who have worked on engineering teams at Google, Facebook, Dropbox, and MongoDB. They all now work at startups focused on helping engineers reduce downtime and customer-impacting failures. Hear about their approaches and preferred methods for getting impactful results. The panel covers incident management, distributed tracing, and chaos engineering in detail.

Speaker

Jason Yee

Director of Advocacy @Gremlin

Jason Yee is Director of Advocacy at Gremlin where he helps people build more resilient systems by learning from how they fail. He also leads the internal Chaos Engineering practices to make Gremlin more reliable. Previously, he worked at Datadog, O’Reilly Media, and MongoDB. His...

Speaker

John Egan

CEO and Co-Founder @Kintaba

John Egan is CEO and cofounder at Kintaba, the modern incident response and management product for teams. Prior to Kintaba, John helped to lead enterprise products at Facebook.

Speaker

Ben Sigelman

CEO and co-founder @LightStepHQ, Co-creator @OpenTracing API standard

Ben Sigelman is a Cofounder & CEO at Lightstep, a company that makes complex microservice applications more transparent and reliable. He is an expert in distributed tracing and also co-founded the OpenTelemetry project.

From the same track

Session + Live Q&A Observability

Resources & Transactions: A Fundamental Duality in Observability

Tuesday May 18 / 12:00PM EDT

Fundamentally, there are only two types of “things worth observing” when it comes to production systems: Resources and Transactions. The tricky (and interesting) part is that they’re entirely codependent. “Transactions” are the things that traverse your system and...

Ben Sigelman

CEO and co-founder @LightStepHQ, Co-creator @OpenTracing API standard

Session + Live Q&A Incident Management

More More More! Why the Most Resilient Companies Want More Incidents

Tuesday May 18 / 10:00AM EDT

Major tech companies like Facebook, Google, and Netflix want more incidents, not fewer. NASA wants them so urgently that they import incidents from other companies. The reason? Postmortems. This talk will focus on how companies of any scale can improve their ingestion of understandability by...

John Egan

CEO and Co-Founder @Kintaba

Session + Live Q&A Observability

Observing and Understanding Failures: SRE Apprentices

Tuesday May 18 / 11:00AM EDT

In this session, Tammy will share how Padawans and Jedis can inspire and teach us how to help people of a wide variety of backgrounds, ages, and experience levels to observe and understand failures in production. Tammy will share how she and a colleague created an SRE Apprentice program to hire...

Tammy Bryant Butow

Principal Site Reliability Engineer @Gremlin
