AI Engineering Podcast

From GPUs to Workloads: Flex AI’s Blueprint for Fast, Cost‑Efficient AI


Listen Later

Summary
In this episode of the AI Engineering Podcast Brijesh Tripathi, CEO of Flex AI, talks about revolutionizing AI engineering by removing DevOps burdens through "workload as a service". Brijesh shares his expertise from leading AI/HPC architecture at Intel and deploying supercomputers like Aurora, highlighting how access friction and idle infrastructure slow progress. He discusses Flex AI's innovative approach to simplifying heterogeneous compute, standardizing on consistent Kubernetes layers, and abstracting inference across various accelerators, allowing teams to iterate faster without wrestling with drivers, libraries, or cloud-by-cloud differences. Brijesh also shares insights into Flex AI's strategies for lifting utilization, protecting real-time workloads, and spanning the full lifecycle from fine-tuning to autoscaled inference, all while keeping complexity at bay.

Announcements
  • Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
  • When ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models - they needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App rely on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated. Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows. But Prefect didn't stop there. They just launched FastMCP - production-ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing-fast Python execution. Deploy your AI tools once, connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and Fast MCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.
  • Your host is Tobias Macey and today I'm interviewing Brijesh Tripathi about FlexAI, a platform offering a service-oriented abstraction for AI workloads
Interview
  • Introduction
  • How did you get involved in machine learning?
  • Can you describe what FlexAI is and the story behind it?
  • What are some examples of the ways that infrastructure challenges contribute to friction in developing and operating AI applications?
    • How do those challenges contribute to issues when scaling new applications/businesses that are founded on AI?
  • There are numerous managed services and deployable operational elements for operationalizing AI systems. What are some of the main pitfalls that teams need to be aware of when determining how much of that infrastructure to own themselves?
  • Orchestration is a key element of managing the data and model lifecycles of these applications. How does your approach of "workload as a service" help to mitigate some of the complexities in the overall maintenance of that workload?
  • Can you describe the design and architecture of the FlexAI platform?
    • How has the implementation evolved from when you first started working on it?
  • For someone who is going to build on top of FlexAI, what are the primary interfaces and concepts that they need to be aware of?
  • Can you describe the workflow of going from problem to deployment for an AI workload using FlexAI?
  • One of the perennial challenges of making a well-integrated platform is that there are inevitably pre-existing workloads that don't map cleanly onto the assumptions of the vendor. What are the affordances and escape hatches that you have built in to allow partial/incremental adoption of your service?
  • What are the elements of AI workloads and applications that you are explicitly not trying to solve for?
  • What are the most interesting, innovative, or unexpected ways that you have seen FlexAI used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on FlexAI?
  • When is FlexAI the wrong choice?
  • What do you have planned for the future of FlexAI?
Contact Info
  • LinkedIn
Parting Question
  • From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?
Links
  • Flex AI
  • Aurora Super Computer
  • CoreWeave
  • Kubernetes
  • CUDA
  • ROCm
  • Tensor Processing Unit (TPU)
  • PyTorch
  • Triton
  • Trainium
  • ASIC == Application Specific Integrated Circuit
  • SOC == System On a Chip
  • Loveable
  • FlexAI Blueprints
  • Tenstorrent
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
...more
View all episodesView all episodes
Download on the App Store

AI Engineering PodcastBy Tobias Macey

  • 4.3
  • 4.3
  • 4.3
  • 4.3
  • 4.3

4.3

6 ratings


More shows like AI Engineering Podcast

View all
The a16z Show by Andreessen Horowitz

The a16z Show

1,087 Listeners

Super Data Science: ML & AI Podcast with Jon Krohn by Jon Krohn

Super Data Science: ML & AI Podcast with Jon Krohn

302 Listeners

NVIDIA AI Podcast by NVIDIA

NVIDIA AI Podcast

333 Listeners

Y Combinator Startup Podcast by Y Combinator

Y Combinator Startup Podcast

226 Listeners

DataFramed by DataCamp

DataFramed

269 Listeners

Practical AI by Practical AI LLC

Practical AI

211 Listeners

Machine Learning Street Talk (MLST) by Machine Learning Street Talk (MLST)

Machine Learning Street Talk (MLST)

95 Listeners

Dwarkesh Podcast by Dwarkesh Patel

Dwarkesh Podcast

501 Listeners

No Priors: Artificial Intelligence | Technology | Startups by Conviction

No Priors: Artificial Intelligence | Technology | Startups

131 Listeners

This Day in AI Podcast by Michael Sharkey, Chris Sharkey

This Day in AI Podcast

227 Listeners

The AI Daily Brief: Artificial Intelligence News and Analysis by Nathaniel Whittemore

The AI Daily Brief: Artificial Intelligence News and Analysis

610 Listeners

AI and I by Dan Shipper

AI and I

33 Listeners

AI + a16z by a16z

AI + a16z

35 Listeners

Lightcone Podcast by Y Combinator

Lightcone Podcast

21 Listeners

Training Data by Sequoia Capital

Training Data

39 Listeners