You are viewing content from a past/completed QCon

Change Data Capture for Distributed Databases @Netflix

At Netflix, we have hundreds of microservices that rely on hybrid backends ranging from RDS or NoSQL to ElasticSearch or Iceberg. This necessitates hybrid backends for distributed high throughput applications since no single database can handle all the access patterns. A classic example of this pattern is Apache Cassandra for robustness and resilience and Elastic Search for search capabilities. Keeping data in sync among these data stores is a challenging problem usually solved by individual teams building sync processes and audits. This adds complexity and operational overhead to the teams. Change Data Capture (CDC) provides an optimal solution for receiving all changes seen on a database. 

CDC events from NoSQL databases with Active-Active setups like Apache Cassandra have unique challenges due to data partitioning and replication. Current CDC solutions for this rely on running within the database cluster and providing a stream with duplicated events. Our solution takes a more efficient approach by de-duping the stream in a stream processing framework. This involves having a distributed copy of the source DB in a stream processing system like Apache Flink. This enables better handling of the CDC stream since we can have before and after images of row changes. 

In this talk, we will cover the challenges associated with capturing CDC events from Cassandra (C*) and how we efficiently provide a data stream of change events while maintaining minimal load on the C* cluster. We will discuss the Flink ecosystem and how the backing store of RocksDB is used to make sure the data stream has full row changes instead of differences between rows which is provided by C* CDC.


Raghuram Onti Srinivasan

Senior Software Engineer @Netflix

Raghuram is a Senior Software Engineer at Netflix where he works on Data Integrations Platform enabling Change Data Capture and Google Suite integrations. He has contributed a Distributed locking mechanism to Dynomite, an open-source in-memory distributed datastore. He has previously worked on real-time bidding systems and ad servers as part of digital ad platforms.


Tharanga Gamaethige

Senior Software Engineer - Core Data Platform Team @Netflix

Tharanga is a Senior Software Engineer of the Core Data Platform team at Netflix where he works on building fault-tolerant and scalable storage systems to bring joy to Netflix viewers. He mainly works in EVCache, Netflix's distributed caching system where he contributes to open source Memcached. When EVCache operates smoothly serving billions of requests, he spends his spare time working on a system that can generate eventually consistent change streams from Cassandra. Many moons ago, he worked on a distributed analytical database as a database kernel engineer.

From the same track

View full Schedule

3 weeks of live software engineering content designed around your schedule.

Don’t miss out! Save your seat now