
Enterprise AI spend on LLM APIs hit $8 billion in just the first half of 2025 — double all of 2024. Most of that spend is going to the same general-purpose models, and Iddo Gino thinks companies are building on a foundation that won't hold.
Iddo founded his first company at 17, scaled it to unicorn status by 24, and spent nearly a decade in the API integration space. His position is straightforward: trillion-parameter models optimized to do everything are expensive, slow, and ill-suited for the narrow, repeatable tasks that make up the majority of production AI workloads. The companies gaining ground are decomposing their systems into specialized models — each trained for one specific task, orders of magnitude smaller, and meaningfully more accurate than any general-purpose model in that lane. He also gets specific about why most fine-tuning efforts quietly fail, why model capability is no longer what's slowing agentic systems down, and what the market actually looks like by 2030 when this plays out.
Topics discussed:
$8B in enterprise LLM API spend in H1 2025 and what's actually driving it
Decomposing agentic systems into narrow subtasks vs. single general-purpose model approaches
Why fine-tuned models have a shelf life and the case for continuous weekly retraining cycles
The integration and data access layer as the real production bottleneck in agentic systems
MIT study: 90% enterprise AI initiative failure rate and what separates the 10% that work
Iddo's 2030 prediction: 50-60% of tokens flowing to specialized models, not large labs
Model agnosticism as a structural hedge against LLM provider lock-in
By Cadre AI