
This episode is sponsored by tastytrade.
Trade stocks, options, futures, and crypto in one platform with low commissions and zero commission on stocks and crypto. Built for traders who think in probabilities, tastytrade offers advanced analytics, risk tools, and an AI-powered Search feature.
Learn more at https://tastytrade.com/
Voice AI is moving far beyond transcription.
In this episode, Carter Huffman, CTO and co-founder of Modulate, explains how real-time voice intelligence is unlocking something much bigger than speech-to-text. His team built AI that understands emotion, intent, deception, harassment, and fraud directly from live conversations. Not after the fact. Instantly.
Carter shares how their technology powers ToxMod to moderate toxic behavior in online games at massive scale, analyzes millions of audio streams with ultra-low latency, and beats foundation models using an ensemble architecture that is faster, cheaper, and more accurate. We also explore voice deepfake detection, scam prevention, sentiment analysis for finance, and why voice might become the most important signal layer in AI.
If you're building voice agents, working on AI safety, or curious where conversational AI is heading next, this conversation breaks down the technical and practical future of voice understanding.
Stay Updated:
Craig Smith on X: https://x.com/craigss
Eye on A.I. on X: https://x.com/EyeOn_AI
(00:00) Real-Time Voice AI: Detecting Emotion, Intent & Lies
(03:07) From MIT & NASA to Building Modulate
(04:45) Why Voice AI Is More Than Just Transcription
(06:14) The Toxic Gaming Problem That Sparked ToxMod
(12:37) Inside the Tech: How "Ensemble Models" Beat Foundation Models
(21:09) Achieving Ultra-Low Latency & Real-Time Performance
(26:16) From Voice Skins to Fighting Harassment at Scale
(37:31) Beyond Gaming: Fraud, Deepfakes & Voice Security
(46:14) Privacy, Ethics & Voice Fingerprinting Risks
(52:10) Lie Detection, Sentiment & Finance Use Cases
(54:57) Opening the API: The Future of Voice Intelligence
By Craig S. Smith