The AI Forecast: Data and AI in the Cloud Era

Open Lakehouse Architecture: How to Scale AI to Production



Open lakehouse architecture is becoming the foundation for running enterprise AI in production at scale.

In this episode of The AI Forecast, Dipankar Mazumdar, Director of Developer Relations at Cloudera and co-author of the book “Engineering Lakehouses with Open Table Formats,” joins Paul Muller to explain why open lakehouse architecture is critical for moving from AI pilots to production AI.

They break down: 

  • How Apache Iceberg and open table formats decouple storage from compute
  • How schema evolution enables change without costly data rewrites
  • How multiple engines can securely access the same data without duplication
  • How to prevent small-file performance bottlenecks
  • How to control AI compute costs at scale
  • How to embed governance, metadata, and data lineage into AI workloads

Production-ready AI requires scalable data architecture and governance built in from day one. AI and GenAI pilots may be everywhere, but your architecture is what truly decides what survives.

Stay in touch with Dipankar:

  • Dipankar Mazumdar on LinkedIn: https://www.linkedin.com/in/dipankar-mazumdar/
  • Dipankar’s website: https://dipankarmazumdar.github.io/
  • Dipankar’s book on Amazon: https://www.amazon.com/Engineering-Lakehouses-Open-Table-Formats-ebook/dp/B0DKJD39X8

+++

Follow and subscribe to The AI Forecast for more conversations with the innovators shaping the future of enterprise AI.


By Cloudera