AI vision is solved. AI reasoning is not. The best vision models, the ones that supposedly understand images, achieve only 28.8% accuracy on tasks requiring physics, time, and causality. You'll trace the journey from 2015's Faster R-CNN breakthrough (56,700+ citations), which replaced messy multi-step detection pipelines with elegant end-to-end deep learning, only to discover the humbling reality: AI can classify objects brilliantly but can't reason about what it sees. Worse, there's a "reasoning illusion": models get right answers through wrong processes. This episode shows you why the gap between perception and understanding matters.
Topics Covered
- Faster R-CNN: The breakthrough that gave AI eyes
- Region Proposal Networks explained simply (see the anchor-grid sketch after this list)
- The reasoning gap: classification ≠ understanding
- RISEBench: Testing temporal, causal, spatial, and logical reasoning
- World models for self-driving (GAIA-2)
- The "reasoning illusion": right answers, wrong process
- Process Verified Accuracy: checking the work, not just the answer (see the sketch after this list)
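
For the Region Proposal Network topic, here is a minimal Python sketch of the core idea: lay a dense grid of candidate "anchor" boxes over the image, which the network then scores for objectness and refines with small coordinate offsets. The stride, scales, and aspect ratios below follow the defaults described in the Faster R-CNN paper; the function name and the toy feature-map size are illustrative, not from the episode.

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Generate anchor boxes (x1, y1, x2, y2) for every feature-map cell.

    This is the fixed, dense grid of candidate boxes that a Region
    Proposal Network learns to score for objectness and refine.
    """
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            # Anchor centre in image-pixel coordinates.
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    # Keep roughly constant area per scale while varying shape.
                    w = scale * np.sqrt(ratio)
                    h = scale / np.sqrt(ratio)
                    anchors.append([cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2])
    return np.array(anchors)

# A ~600x1000 image with stride-16 features gives a ~38x63 feature map,
# so 38 * 63 * 9 = 21,546 candidate boxes exist before any learning happens.
anchors = generate_anchors(feat_h=38, feat_w=63)
print(anchors.shape)  # (21546, 4)
```

The point of the sketch: the "proposals" start as nothing more than this exhaustive grid, and the learning is entirely in scoring and adjusting it.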
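
For the Process Verified Accuracy topic, a hedged sketch of what such a metric might compute: a prediction counts only if the final answer is right and every intermediate reasoning step also checks out. The record format and function name here are hypothetical illustrations, not taken from any particular benchmark.

```python
from typing import Dict, List

def process_verified_accuracy(records: List[Dict]) -> float:
    """Score a model on both its answer and its reasoning.

    Each record is assumed (for illustration) to look like:
        {"answer_correct": bool, "steps_verified": [bool, ...]}
    A record counts only if the final answer is right AND every
    intermediate reasoning step was independently verified.
    """
    if not records:
        return 0.0
    verified = sum(
        1 for r in records
        if r["answer_correct"] and all(r["steps_verified"])
    )
    return verified / len(records)

# Toy example: two "right answers", but one was reached via a broken step,
# so plain accuracy would be 100% while process-verified accuracy is 50%.
results = [
    {"answer_correct": True, "steps_verified": [True, True, True]},
    {"answer_correct": True, "steps_verified": [True, False, True]},
]
print(process_verified_accuracy(results))  # 0.5
```

This is exactly the gap the "reasoning illusion" topic points at: checking only the final answer can make a model look far more capable than its reasoning actually is.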