The GeekNarrator

What makes Apache Pinot so Fast?


Listen Later

For memberships: join this channel as a member here:https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/joinSummary:In this episode, host Kaivalya Apte interviews Ankit Sultana, a staff engineer at Uber with extensive experience in Apache Pinot, a real-time analytics platform. They discuss the high-level architecture, ingestion processes, and query mechanisms of Apache Pinot. Ankit provides a historical context, detailing the evolution of Apache Pinot from its origins at LinkedIn to its widespread adoption. They discuss the key components of Pinot, explaining the roles of Pinot servers, brokers, controllers, and the dependency on Zookeeper. Ankit also explained how data flows into Apache Pinot and the technicalities of its real-time ingestion and querying capabilities. Chapters:00:00 Introduction and Episode Overview03:30 Understanding Apache Pinot03:49 Apache Pinot's Historical Background05:20 Real-Time Analytics with Apache Pinot11:06 Apache Pinot's Architecture and Components17:05 Tenancy and Data Ingestion in Apache Pinot30:22 Understanding Real-Time Replication and Consumer Groups30:52 Pinot's Offset Tracking and Segment Creation31:59 Handling Server Restarts and Segment Transitions32:50 Dealing with Kafka Duplicates and Deduplication Features35:13 Ingestion Process and Mutable vs Immutable Segments39:18 Memory Management and Segment Flushing40:10 Advantages of Keeping Mutable Segments Longer42:21 Introduction to Pinot's Query Engines42:50 Single Stage Engine: Architecture and Optimizations54:49 Multi-Stage Engine: Flexibility and Challenges58:13 Conclusion and Next StepsImportant Links:* Good high-level overview on Pinot: https://www.youtube.com/watch?v=F8Q_pGIH9yY* Apache Pinot 101 by Tim: https://www.youtube.com/playlist?list=PLihIrF0tCXdfN6y-twj9KtWaXM1GH4RSe* Multistage Physical Optimizer, the new optimizer that we built at Uber and open-sourced: https://docs.pinot.apache.org/users/user-guide-query/multi-stage-query/physical-optimizer* Multistage Lite Mode: https://docs.pinot.apache.org/users/user-guide-query/multi-stage-query/multistage-lite-mode* Time Series Engine Talk at RTA Summit: https://www.youtube.com/watch?v=kgseiambgesFor memberships: join this channel as a member here:https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/joinDon't forget to like, share, and subscribe for more insights!=============================================================================Like building stuff? Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.https://app.codecrafters.io/join?via=geeknarrator=============================================================================Database internals series: https://youtu.be/yV_Zp0Mi3xsPopular playlists:Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_dModern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsNStay Curios! Keep Learning!

...more
View all episodesView all episodes
Download on the App Store

The GeekNarratorBy Kaivalya Apte

  • 5
  • 5
  • 5
  • 5
  • 5

5

3 ratings


More shows like The GeekNarrator

View all
Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

622 Listeners

TED Tech by TED Tech

TED Tech

396 Listeners

Up First from NPR by NPR

Up First from NPR

56,524 Listeners

Dwarkesh Podcast by Dwarkesh Patel

Dwarkesh Podcast

517 Listeners

The Daily Brief by Zerodha

The Daily Brief

14 Listeners

The Pragmatic Engineer by Gergely Orosz

The Pragmatic Engineer

64 Listeners