Shirshanka Das

(He / him / his)

Founder of LinkedIn DataHub, Apache Gobblin, Acryl Data

Shirshanka is co-founder and CEO of Acryl Data, the company commercializing the open source DataHub project, a real-time metadata platform used by LinkedIn, Expedia, Saxo Bank, Klarna, Viasat, and many others.

Prior to founding Acryl, he was the overall architect for Big Data at LinkedIn from 2010 to 2020, responsible for creating the company's metadata and data management strategy. As part of this, he founded the DataHub project and shaped its evolution into a metadata platform that powers DataOps, MLOps, productivity, and governance use cases at LinkedIn. He is also a PMC member and committer on the Apache Gobblin project, which manages 100PB+ of data assets at rest at LinkedIn and is deployed in production at other large companies such as Verizon and PayPal.

Prior to LinkedIn, Shirshanka worked on high-performance serving systems at Yahoo and PayPal. Shirshanka has a Ph.D. in Computer Science from UCLA.

PANEL DISCUSSION

Managing Data at Scale

Since the advent of the internet, the need for reliable, low-latency access to data has grown at a rapid pace. Data infrastructure, which was once a single monolithic database, has evolved into a tapestry of point solutions tied together by data movement infrastructure (e.g. data replication streams). What was once the domain of DBAs is now accessed by engineers, analysts, ops, and often non-technical folks as well. A simple set of tables has become a complex latticework of data sets, streams, batch jobs, and the like. With this increase in complexity come new challenges and concerns.

Some of the concerns we will tackle:

  • How do companies manage the ever-growing complexity in modern data ecosystems? 
  • How do data operations teams keep track of tens of thousands of daily job executions, and failures in particular?
  • How do the security, governance, and compliance folks ensure that the right people have access to the right data fields in order to preserve end-user privacy? 
  • What are the contracts between data producers & data consumers & how are they enforced?
    • How do data producers shield data consumers from breaking changes in schemas? 
    • How do data consumers find the data sets they need, and how are they notified when those data sets are end-of-life’d?

Date

Monday Nov 8 / 02:10PM EST (40 minutes)

Track

Modern Data Architectures, Pipelines, & Streams

Topics

Data Streams, Data Engineering, Database

Deep-dive with world-class software leaders at QCon Plus (Nov 1-12, 2021).
