EP042 starts with OpenClaw v2026.4.26: browser realtime transport contracts, constrained Google Live tokens, Gateway relay sessions, bundled Cerebras provider support, manifest-owned provider routing metadata, asymmetric embedding input types, retrieval prefixes for local embedding models, safer plugin mutation, Matrix encryption setup, transcript compaction, and migration tooling. Then we go deeper than prior episodes on inference infrastructure: Groq’s LPU-backed hosted inference, Cerebras wafer-scale inference, LM Studio’s local desktop/server stack, Ollama’s local runner and cloud tiers, OpenRouter’s multi-provider marketplace, and LiteLLM’s role as a self-hostable gateway, with a cost-per-value rating for each. We close with OpenAI Privacy Filter as a local PII token classifier and Google Cloud AI zones as accelerator-placement infrastructure.
Show notes: https://tobyonfitnesstech.com/podcasts/episode-42/