Chinmay Soman

PMC Member/Committer @SamzaStream

Chinmay Soman has been working in the distributed systems domain for more than 10 years. He started out at IBM, where he worked on distributed file systems and replication technologies. He then joined the Data Infrastructure team at LinkedIn and worked on open source technologies such as Voldemort and Apache Samza. Until recently, he was a Senior Staff Software Engineer at Uber, where he led the Streaming and Real-Time Analytics Platform team. Currently, he’s a founding engineer at a stealth-mode company.

Session

Building Latency Sensitive User Facing Analytics via Apache Pinot

Real-time analytics has become essential for modern Internet companies. The ability to derive internal insights about business metrics, user growth and adoption, and security incidents from raw logs is crucial for day-to-day operations. Even more critical, and non-trivial to achieve, is enabling access to usage analytics for millions of customers.

A good example of this is LinkedIn’s ‘Who Viewed My Profile’, which allows all 700 million+ users to slice and dice their page view data. Another example is Uber’s Restaurant Manager, which enables restaurant owners across the globe to gain insights into menu preferences, sales metrics, busy hours and so on. All such user-facing applications need an analytical store that can support thousands of queries per second with millisecond-level response times while ingesting millions of events per second.

In this talk, we will elaborate on how this is made possible using Apache Pinot, a popular, open source, distributed OLAP store. Specifically, we will talk about how to maintain the p99 latency SLA in the presence of organic data growth and concurrent queries.

Date

Tuesday Nov 17 / 02:40PM EST (40 minutes)

Track

Modern Data Engineering

PANEL DISCUSSION

Modern Data Engineering Panel

Data engineering is a vast field that concerns itself with efficient access to data based on the needs of a business. Though data is the prized entity from which a company extracts insights, data doesn't exist in a void. It first needs to be stored somewhere, and then an API needs to be provided so that a client can access it. Therein lie the opportunity and the challenge. We have seen an explosion of technologies in the field of data engineering (OLTP DBs, OLAP DBs, data streams, big data processing, search engines, graph processing and graph serving engines, caches, block and object stores, etc.). When you consider the myriad ways these puzzle pieces can be put together to build a modern data engineering stack, you soon find that no two stacks resemble one another.

What does a data engineer need to know in order to be successful in today's world? What are some best practices, pitfalls, and ways to think about building a low cost-of-ownership, high-quality data platform? What technologies are non-starters and why? What technologies are hidden gems? Finally, what should the industry think about and what is coming next? Join our panel of experts as we explore these questions in order to shed light on these areas.

Date

Tuesday Nov 17 / 03:30PM EST (40 minutes)

Track

Modern Data Engineering
