AI Engineering Podcast

Optimize Your AI Applications Automatically With The TensorZero LLM Gateway


Listen Later

Summary
In this episode of the AI Engineering podcast Viraj Mehta, CTO and co-founder of TensorZero, talks about the use of LLM gateways for managing interactions between client-side applications and various AI models. He highlights the benefits of using such a gateway, including standardized communication, credential management, and potential features like request-response caching and audit logging. The conversation also explores TensorZero's architecture and functionality in optimizing AI applications by managing structured data inputs and outputs, as well as the challenges and opportunities in automating prompt generation and maintaining interaction history for optimization purposes.


Announcements
  • Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
  • Seamless data integration into AI applications often falls short, leading many to adopt RAG methods, which come with high costs, complexity, and limited scalability. Cognee offers a better solution with its open-source semantic memory engine that automates data ingestion and storage, creating dynamic knowledge graphs from your data. Cognee enables AI agents to understand the meaning of your data, resulting in accurate responses at a lower cost. Take full control of your data in LLM apps without unnecessary overhead. Visit aiengineeringpodcast.com/cognee to learn more and elevate your AI apps and agents. 
  • Your host is Tobias Macey and today I'm interviewing Viraj Mehta about the purpose of an LLM gateway and his work on TensorZero
Interview
  • Introduction
  • How did you get involved in machine learning?
  • What is an LLM gateway?
    • What purpose does it serve in an AI application architecture?
  • What are some of the different features and capabilities that an LLM gateway might be expected to provide?
  • Can you describe what TensorZero is and the story behind it?
    • What are the core problems that you are trying to address with Tensor0 and for whom?
  • One of the core features that you are offering is management of interaction history. How does this compare to the "memory" functionality offered by e.g. LangChain, Cognee, Mem0, etc.?
  • How does the presence of TensorZero in an application architecture change the ways that an AI engineer might approach the logic and control flows in a chat-based or agent-oriented project?
  • Can you describe the workflow of building with Tensor0 and some specific examples of how it feeds back into the performance/behavior of an LLM?
  • What are some of the ways in which the addition of Tensor0 or another LLM gateway might have a negative effect on the design or operation of an AI application?
  • What are the most interesting, innovative, or unexpected ways that you have seen TensorZero used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on TensorZero?
  • When is TensorZero the wrong choice?
  • What do you have planned for the future of TensorZero?
Contact Info
  • LinkedIn
Parting Question
  • From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?
Closing Announcements
  • Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
  • TensorZero
  • LLM Gateway
  • LiteLLM
  • OpenAI
  • Google Vertex
  • Anthropic
  • Reinforcement Learning
  • Tokamak Reactor
  • Viraj RLHF Paper
  • Contextual Dueling Bandits
  • Direct Preference Optimization
  • Partially Observable Markov Decision Process
  • DSPy
  • PyTorch
  • Cognee
  • Mem0
  • LangGraph
  • Douglas Hofstadter
  • OpenAI Gym
  • OpenAI o1
  • OpenAI o3
  • Chain Of Thought
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
...more
View all episodesView all episodes
Download on the App Store

AI Engineering PodcastBy Tobias Macey

  • 4.3
  • 4.3
  • 4.3
  • 4.3
  • 4.3

4.3

6 ratings


More shows like AI Engineering Podcast

View all
The Cloudcast by Massive Studios

The Cloudcast

153 Listeners

a16z Podcast by Andreessen Horowitz

a16z Podcast

994 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

629 Listeners

Super Data Science: ML & AI Podcast with Jon Krohn by Jon Krohn

Super Data Science: ML & AI Podcast with Jon Krohn

296 Listeners

NVIDIA AI Podcast by NVIDIA

NVIDIA AI Podcast

322 Listeners

Data Engineering Podcast by Tobias Macey

Data Engineering Podcast

139 Listeners

AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion by AI & Data Today

AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion

144 Listeners

Practical AI by Practical AI LLC

Practical AI

189 Listeners

The Stack Overflow Podcast by The Stack Overflow Podcast

The Stack Overflow Podcast

63 Listeners

Last Week in AI by Skynet Today

Last Week in AI

281 Listeners

Machine Learning Street Talk (MLST) by Machine Learning Street Talk (MLST)

Machine Learning Street Talk (MLST)

88 Listeners

No Priors: Artificial Intelligence | Technology | Startups by Conviction

No Priors: Artificial Intelligence | Technology | Startups

124 Listeners

Latent Space: The AI Engineer Podcast by swyx + Alessio

Latent Space: The AI Engineer Podcast

63 Listeners

The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis by Nathaniel Whittemore

The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis

423 Listeners

AI + a16z by a16z

AI + a16z

33 Listeners