
In this episode, I invited Neha Pawar, one of the Founding Engineers at StarTree (the company powering Apache Pinot). We talked about how StarTree has implemented tiered storage and how it differs from other available implementations. Note: Tiered storage is currently available only in StarTree's Pinot and not in the open source version, but it's only a matter of time.
Chapters:
00:00 Introduction
03:28 What does Tiered Storage mean?
05:51 How many tiers are typically supported?
07:30 Is it mainly about Cost Optimisation? How do I compare the cost savings vs performance hit?
15:41 What is mmap and how does it help?
16:45 How do I implement/approach Tiered Storage? What are the challenges?
23:00 What is Apache Pinot? When we say low latency, how low is it?
25:00 How is it implemented in StarTree (Apache Pinot)?
36:45 What happens when I query a larger number of (or all) columns? How is that optimised?
47:10 What are the failure modes?
50:15 How can we test and validate Tiered Storage as a feature?
54:30 How would bloom filter false positives affect performance and correctness?
56:15 Can I move my data back from Cold Storage to Hot Storage?
57:45 What other cloud storage services are supported other than S3?
58:35 What is the future of Tiered Storage?