This episode follows six concrete changes in the agent stack: Ollama pushing deeper into local coding-agent runtimes, LM Studio improving Apple Silicon vision inference and remote local servers, NVIDIA positioning DGX Spark as a serious local-agent machine, EXO showing where distributed local inference still needs hardening, xAI shipping Grok Build while redirecting older model slugs to Grok 4.3, and LiteLLM plus Envoy AI Gateway tightening the routing layer that sits between agents and models.
Show notes: https://tobyonfitnesstech.com/podcasts/episode-52/