All organizations, regardless of size, need to be able to make rapid changes and improvements in their constantly growing systems. How can we handle all this change while maintaining a reliable product?
In 2018, Lyft operated a few hundred services. Deploying a change was difficult: a developer had to take a lock, read a runbook, and then execute several manual steps, all while monitoring for potential problems. This took time, which led to large, delayed deploy trains and, ultimately, reliability issues. Today, Lyft operates over 1,000 services and, by adopting continuous delivery, deploys more than 90% of them to production automatically, with no manual intervention. This has significantly improved reliability, freed up developer time, and sped up our ability to ship changes.
I will share the details of our journey to continuous delivery, including the benefits, challenges, and lessons we learned along the way:
- The benefits, obvious and maybe non-obvious, of continuous delivery.
- How to set up your organization’s deploy culture to successfully adopt continuous delivery.
- How to design a pluggable system of checks to automatically detect issues in deployments before they become widespread.
- How we measured to ensure we improved reliability and developer productivity through continuous delivery.
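
As a rough illustration of the "pluggable system of checks" idea in the list above, here is a minimal Python sketch. It is not Lyft's implementation: the `Deployment` fields, the `register_check` decorator, the two example checks, and the 1% error-rate limit are all hypothetical, chosen only to show how independent checks can be registered and then evaluated together before a deploy is promoted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Deployment:
    """Hypothetical snapshot of a rollout in progress."""
    service: str
    error_rate: float   # fraction of failed requests seen after the rollout
    open_alerts: int    # number of alerts currently firing for the service


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool
    reason: str


# A check takes a deployment and returns (passed, human-readable reason).
CheckFn = Callable[[Deployment], Tuple[bool, str]]

# Registry of all known checks; adding a check never touches the runner.
CHECKS: Dict[str, CheckFn] = {}


def register_check(name: str) -> Callable[[CheckFn], CheckFn]:
    """Decorator that plugs a check into the registry under `name`."""
    def decorator(fn: CheckFn) -> CheckFn:
        CHECKS[name] = fn
        return fn
    return decorator


@register_check("error_rate")
def error_rate_check(d: Deployment) -> Tuple[bool, str]:
    limit = 0.01  # assumed threshold, purely for illustration
    return d.error_rate <= limit, f"error rate {d.error_rate:.2%} (limit {limit:.2%})"


@register_check("no_open_alerts")
def open_alerts_check(d: Deployment) -> Tuple[bool, str]:
    return d.open_alerts == 0, f"{d.open_alerts} alert(s) firing"


def evaluate(d: Deployment) -> List[CheckResult]:
    """Run every registered check against the deployment."""
    return [CheckResult(name, *fn(d)) for name, fn in CHECKS.items()]


def should_promote(d: Deployment) -> bool:
    """Promote to the next stage only if every check passes."""
    return all(result.passed for result in evaluate(d))
```

In this sketch, a team could add a new safeguard by writing one decorated function, and `should_promote` would halt a rollout as soon as any registered check fails.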
Interview:
What is the focus of your work these days?
I've been at Lyft for almost four years now. Since I joined, my day-to-day responsibility has been overseeing the infrastructure, mostly the production side of it: what happens after the code has been unit tested and built and is ready to go to production. I've focused a lot on deployment, as you'll see in this talk. Now I oversee more than just deployment, including the networking, how the processes work, how we manage our Kubernetes clusters, things like that.
And what's the motivation for your talk?
I feel that continuous deployment, or continuous delivery, is the apex of automation, right? The idea seems a little scary, but it also seems fantastic. It almost seems like a fantasy world that no one can really get to. You can ship a code change at any time and it can safely go to production. You could even deploy on a Friday. That just sounds like a fantasy world. And I think the motivation here is that we haven't really gotten to that true fantasy world. We don't have unicorns and stuff floating around. But we're getting really close. The purpose of this talk is to show people that it's not really an impossible goal. It is possible, especially if you cut the right corners, the ones that are appropriate for your organization. So, yeah, I'm here to tell you that it is possible.
How would you describe the persona and level of your target audience?
The persona for this talk is someone who either is in a decision-making position, or wants to be in one, and wants to create a very large cultural shift in their organization. If you adopt continuous deployment or continuous delivery, it tends to change the shipping culture of the entire organization. It's something that we've seen, and something that I'll mention in this talk as well. So if you want to be someone who can usher in that big phase change in your organization, I think that's the type of person this is for.
You've touched on this a little bit, but what is it that you would like the people that go to your session to walk away with?
As I said earlier, this is possible. It can seem like an impossible task. I know some larger organizations definitely have some aspects of continuous deployment. But think of organizations that are in the teenager phase, where one part of the organization has grown but another hasn't grown along with it, and so maybe your infrastructure team is a little weaker than the rest. Even in those cases, you can build a tool that makes continuous deployment possible. We've been in that phase where we were growing unevenly, and even in that situation we were able to ship something that has helped tremendously with our operations.
Speaker
Tom Wanielista
Senior Staff Software Engineer @Lyft
Tom Wanielista is part of the Infrastructure team at Lyft, where he has focused on improving reliability in production by speeding up the deployment feedback loop. Prior to Lyft, Tom worked on Infrastructure in the Fintech space, where he was responsible for building tools to allow developers to safely deploy changes while keeping the stack secure & compliant. Tom studied at New York University where he received a BA in Computer Science.