Raghuram Onti Srinivasan

Senior Software Engineer @Netflix

Raghuram is a Senior Software Engineer at Netflix where he works on Data Integrations Platform enabling Change Data Capture and Google Suite integrations. He has contributed a Distributed locking mechanism to Dynomite, an open-source in-memory distributed datastore. He has previously worked on real-time bidding systems and ad servers as part of digital ad platforms.

Presentation

Change Data Capture for Distributed Databases @Netflix

At Netflix, we have hundreds of microservices that rely on hybrid backends ranging from RDS or NoSQL to ElasticSearch or Iceberg. This necessitates hybrid backends for distributed high throughput applications since no single database can handle all the access patterns. A classic example of this pattern is Apache Cassandra for robustness and resilience and Elastic Search for search capabilities. Keeping data in sync among these data stores is a challenging problem usually solved by individual teams building sync processes and audits. This adds complexity and operational overhead to the teams. Change Data Capture (CDC) provides an optimal solution for receiving all changes seen on a database. 

CDC events from NoSQL databases with Active-Active setups like Apache Cassandra have unique challenges due to data partitioning and replication. Current CDC solutions for this rely on running within the database cluster and providing a stream with duplicated events. Our solution takes a more efficient approach by de-duping the stream in a stream processing framework. This involves having a distributed copy of the source DB in a stream processing system like Apache Flink. This enables better handling of the CDC stream since we can have before and after images of row changes. 

In this talk, we will cover the challenges associated with capturing CDC events from Cassandra (C*) and how we efficiently provide a data stream of change events while maintaining minimal load on the C* cluster. We will discuss the Flink ecosystem and how the backing store of RocksDB is used to make sure the data stream has full row changes instead of differences between rows which is provided by C* CDC.

DATE

Wednesday Nov 11 / 09:00AM PST (40 minutes )

TRACK Distributed Systems for Developers ADD TO CALENDAR Add to calendar SHARE

3 weeks of live software engineering content designed around your schedule.

Don’t miss out! Save your seat now

Register
TOP