O'Reilly Data Show Podcast

In the age of AI, fundamental value resides in data



In this episode of the Data Show, I spoke with Haoyuan Li, CEO and founder of Alluxio, a startup commercializing the open source project of the same name (full disclosure: I’m an advisor to Alluxio). Our discussion focuses on the state of Alluxio (the open source project that has roots in UC Berkeley’s AMPLab), specifically emerging use cases here and in China. Given the large-scale use in China, I also wanted to get Li’s take on the state of data and AI technologies in Beijing and other parts of China.
Here are some highlights from our conversation:
A much needed layer between compute and storage in a world with disparate storage systems
This new layer, which we call a virtual distributed file system, sits between the compute and storage layers. It virtualizes data from different storage systems and presents a unified API with a global namespace, so data-driven applications can interact with all of the data in the enterprise environment.
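To make the idea of a global namespace concrete, here is a minimal sketch in Python. It is not Alluxio's API: `InMemoryStore` and `UnifiedNamespace` are hypothetical names invented for illustration, and the longest-prefix mount lookup is an assumption about how such a layer could route a path to the storage system behind it.

```python
class InMemoryStore:
    """Hypothetical stand-in for a real storage system (a file system, an object store, ...)."""

    def __init__(self, name):
        self.name = name
        self._blobs = {}

    def write(self, key, data):
        self._blobs[key] = data

    def read(self, key):
        return self._blobs[key]


class UnifiedNamespace:
    """Illustrative layer: one path space in front of several backing stores."""

    def __init__(self):
        self._mounts = {}  # mount prefix -> backing store

    def mount(self, prefix, store):
        self._mounts[prefix.rstrip("/")] = store

    def _resolve(self, path):
        # Assumption: the longest matching mount prefix wins.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + "/"):
                return self._mounts[prefix], path[len(prefix):].lstrip("/")
        raise FileNotFoundError(path)

    def write(self, path, data):
        store, key = self._resolve(path)
        store.write(key, data)

    def read(self, path):
        store, key = self._resolve(path)
        return store.read(key)


# Two different systems appear to the application as a single tree.
warehouse = InMemoryStore("warehouse-fs")
bucket = InMemoryStore("object-store")

ns = UnifiedNamespace()
ns.mount("/warehouse", warehouse)
ns.mount("/warehouse/raw", bucket)

ns.write("/warehouse/tables/users.parquet", b"table-bytes")
ns.write("/warehouse/raw/events.json", b"event-bytes")
```

The application only ever sees paths under `/warehouse`; which system holds the bytes is decided by the mount table, not by the caller.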
AI and machine learning applications
One key reason people use an object store is that it is cheap. Per gigabyte or per terabyte, it’s cheaper than other solutions on the market, … but performance is not as good. From that perspective, putting open source Alluxio on top of it improves performance through Alluxio’s caching functionality. On top of that, in many cases, machine learning libraries cannot directly talk with object stores, and Alluxio can also serve as a translation layer.
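The caching point can be illustrated with a small read-through cache, sketched here in Python. This is a generic technique, not a description of Alluxio's internals: `SlowObjectStore` and `ReadThroughCache` are hypothetical names, and the least-recently-used eviction policy is an assumption chosen only to keep the example short.

```python
from collections import OrderedDict


class SlowObjectStore:
    """Hypothetical cheap-but-slow backend; counts how often it is actually read."""

    def __init__(self, objects):
        self._objects = dict(objects)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._objects[key]


class ReadThroughCache:
    """Serves repeated reads from memory and falls back to the backend on a miss."""

    def __init__(self, backend, capacity=2):
        self._backend = backend
        self._capacity = capacity
        self._cache = OrderedDict()  # key -> data, oldest entry first

    def get(self, key):
        if key in self._cache:
            self._cache.move_to_end(key)  # mark as most recently used
            return self._cache[key]
        data = self._backend.get(key)  # miss: go to the slow store
        self._cache[key] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the least recently used entry
        return data


store = SlowObjectStore({"a": b"1", "b": b"2", "c": b"3"})
cache = ReadThroughCache(store, capacity=2)

cache.get("a")
cache.get("a")  # second read is served from the cache
reads_after_repeat = store.reads
```

Repeated reads of hot data never reach the object store, which is where the performance gain in this kind of setup would come from.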
Adoption in China
Things are moving very fast in that region. People are eager to adopt new technology, particularly for AI and big data. Some users we know very quickly boosted their Alluxio deployments to hundreds of nodes or even thousands of nodes. It’s amazing to see how fast they can adapt.
… Of the top 10 internet companies in China, nine are using open source Alluxio in production today. All nine of them have big data and AI use cases for Alluxio. … I also travel back and forth between these two regions quite often, and every time I go there, I see more use cases, more applications, and more innovation.
Related resources:
Michael Franklin on the lasting legacy of AMPLab
Jason Dai on why “Companies in China are moving quickly to embrace AI technologies”
Kai-Fu Lee on “China: AI superpower”
Andrew Feldman on why “Specialized hardware for deep learning will unleash innovation”
Greg Diamos on “How big compute is powering the deep learning rocket ship”
Tim Kraska on “How machine learning will accelerate data management systems”

O'Reilly Data Show Podcast, by O'Reilly Media


