You are viewing content from a past/completed QCon Plus - November 2020

Session

Change Data Capture for Distributed Databases @Netflix

At Netflix, we have hundreds of microservices that rely on hybrid backends ranging from RDS or NoSQL to ElasticSearch or Iceberg. This necessitates hybrid backends for distributed high throughput applications since no single database can handle all the access patterns. A classic example of this pattern is Apache Cassandra for robustness and resilience and Elastic Search for search capabilities. Keeping data in sync among these data stores is a challenging problem usually solved by individual teams building sync processes and audits. This adds complexity and operational overhead to the teams. Change Data Capture (CDC) provides an optimal solution for receiving all changes seen on a database. 

CDC events from NoSQL databases with Active-Active setups like Apache Cassandra have unique challenges due to data partitioning and replication. Current CDC solutions for this rely on running within the database cluster and providing a stream with duplicated events. Our solution takes a more efficient approach by de-duping the stream in a stream processing framework. This involves having a distributed copy of the source DB in a stream processing system like Apache Flink. This enables better handling of the CDC stream since we can have before and after images of row changes. 

In this talk, we will cover the challenges associated with capturing CDC events from Cassandra (C*) and how we efficiently provide a data stream of change events while maintaining minimal load on the C* cluster. We will discuss the Flink ecosystem and how the backing store of RocksDB is used to make sure the data stream has full row changes instead of differences between rows which is provided by C* CDC.


Speaker

Raghuram Onti Srinivasan

Senior Software Engineer @Netflix

Raghuram is a Senior Software Engineer at Netflix where he works on Data Integrations Platform enabling Change Data Capture and Google Suite integrations. He has contributed a Distributed locking mechanism to Dynomite, an open-source in-memory distributed datastore. He has previously worked on...

Read more

Speaker

Tharanga Gamaethige

Senior Software Engineer - Core Data Platform Team @Netflix

Tharanga is a Senior Software Engineer of the Core Data Platform team at Netflix where he works on building fault-tolerant and scalable storage systems to bring joy to Netflix viewers. He mainly works in EVCache, Netflix's distributed caching system where he contributes to open source...

Read more

From the same track

Session

Essential Complexity in Systems Architecture

Wednesday Nov 11 / 12:00PM EST

Simplicity is a virtue - simple systems should be easier to understand, easier to work on, easier to operate, and thus, more reliable. Everyone agrees on this in principle, to the extent that the ideal of promoting simplicity now appears in industry job ladder descriptions.    But what...

Laura Nolan

Senior Staff Engineer @Slack, Contributor to Seeking SRE, & SRECon Steering Committee

Session

Greenwater, Washington: An Availability Story

Wednesday Nov 11 / 12:50PM EST

System availability is a personal experience. If you’re having a bad time, it’s not much comfort if everybody else is doing fine. Unfortunately, the traditional ways - and even many of the modern ways - of thinking about availability don’t take this into consideration, and...

Marc Brooker

Senior Principal Engineer @AWS

PANEL DISCUSSION

Distributed Systems Panel

Wednesday Nov 11 / 02:30PM EST

View full Schedule

Less than

16

weeks until QCon Plus May 2022

Level-up on the emerging software trends and practices you need to know about.

Deep-dive with world-class software leaders at QCon Plus (Nov 1-12, 2021).

Save your spot for $549 before February 7th

Register