Semi Doped

Nvidia "Acquires" Groq



Key Topics

  • What Nvidia actually bought from Groq and why it is not a traditional acquisition
  • Why the deal triggered claims that GPUs and HBM are obsolete
  • Architectural trade-offs between GPUs, TPUs, XPUs, and LPUs
  • SRAM vs. HBM: speed, capacity, cost, and supply-chain realities
  • Groq LPU fundamentals: VLIW, compiler-scheduled execution, determinism, ultra-low latency
  • Why LPUs struggle with large models and where they excel instead (see the back-of-envelope sketch after this list)
  • Practical use cases for hyper-low-latency inference:
    • Ad copy personalization at search latency budgets
    • Model routing and agent orchestration
    • Conversational interfaces and real-time translation
    • Robotics and physical AI at the edge
    • Potential applications in AI-RAN and telecom infrastructure
  • Memory as a design spectrum: SRAM-only, SRAM plus DDR, SRAM plus HBM
  • Nvidia’s growing portfolio approach to inference hardware rather than one-size-fits-all
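
The capacity gap behind the SRAM vs. HBM and large-model bullets above is easy to make concrete. A minimal back-of-envelope sketch, using approximate public figures (~230 MB of on-chip SRAM per first-generation GroqChip, a GPU with ~192 GB of HBM); treat every constant here as an illustrative assumption, not a spec:

```python
import math

# Approximate public figures, used only for scale (assumptions, not specs).
GROQ_SRAM_PER_CHIP_GB = 0.230   # ~230 MB of on-chip SRAM per GroqChip
GPU_HBM_GB = 192.0              # a current-generation HBM GPU in the ~192 GB class

def chips_to_hold_weights(params_billions: float, bytes_per_param: float,
                          memory_per_chip_gb: float) -> int:
    """Chips needed just to hold the weights (ignores KV cache and activations)."""
    weights_gb = params_billions * bytes_per_param
    return math.ceil(weights_gb / memory_per_chip_gb)

# A small 8B model at 8-bit weights (~8 GB) fits across a modest LPU rack.
print(chips_to_hold_weights(8, 1.0, GROQ_SRAM_PER_CHIP_GB))   # -> 35 chips

# A 70B model at 8-bit weights (~70 GB) needs hundreds of SRAM-only chips,
# yet fits on a single HBM GPU with headroom left for the KV cache.
print(chips_to_hold_weights(70, 1.0, GROQ_SRAM_PER_CHIP_GB))  # -> 305 chips
print(chips_to_hold_weights(70, 1.0, GPU_HBM_GB))             # -> 1 chip
```

This arithmetic is why the episode frames LPUs as excelling at small, latency-critical models while frontier-scale models stay on HBM-based systems.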

Core Takeaways

  • GPUs are not dead. HBM is not dead.
  • LPUs solve a different problem: deterministic, ultra-low-latency inference for small models (a routing sketch follows this list).
  • Large frontier models still require HBM-based systems.
  • Nvidia’s move expands its inference portfolio surface area rather than replacing GPUs.
  • The future of AI infrastructure is workload-specific optimization and TCO-driven deployment.
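
One way to picture the "different problem" takeaway is a routing layer that sends each request to the hardware tier it actually fits, as sketched below. The endpoints and cutoffs (LPU_MAX_PARAMS_B, LPU_LATENCY_CUTOFF_MS) are hypothetical values for illustration, not Groq or Nvidia guidance:

```python
from dataclasses import dataclass

@dataclass
class Request:
    model_params_b: float   # model size, billions of parameters
    latency_slo_ms: float   # end-to-end latency budget for this call
    max_tokens: int         # requested generation length

# Hypothetical endpoint names; a real deployment would use its own service URLs.
LPU_ENDPOINT = "https://lpu.example.internal/v1"
GPU_ENDPOINT = "https://gpu.example.internal/v1"

# Illustrative cutoffs: SRAM-resident models must be small, and the LPU tier
# only pays off when the latency budget is tight.
LPU_MAX_PARAMS_B = 20.0
LPU_LATENCY_CUTOFF_MS = 100.0

def route(req: Request) -> str:
    """Send small, latency-critical requests to the deterministic LPU tier;
    everything else goes to the HBM-backed GPU tier."""
    if (req.model_params_b <= LPU_MAX_PARAMS_B
            and req.latency_slo_ms <= LPU_LATENCY_CUTOFF_MS):
        return LPU_ENDPOINT
    return GPU_ENDPOINT

# Ad-copy personalization inside a ~50 ms search budget -> LPU tier.
print(route(Request(model_params_b=8, latency_slo_ms=50, max_tokens=64)))
# A long frontier-model generation with a relaxed budget -> GPU tier.
print(route(Request(model_params_b=400, latency_slo_ms=5000, max_tokens=2048)))
```

The same shape of logic covers the agent-orchestration use case: the orchestrator itself can run on the fast tier while delegating heavyweight reasoning steps to GPUs.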



By Vikram Sekar and Austin Lyons