Domesticating AI

Hardware-First Home AI: Chips, Memory, Backends, and What to Buy



Episode 3 is a hardware-first guide to running AI at home. We break down what CPUs, GPUs, NPUs, and TPUs each do in the inference pipeline; why memory capacity is not the same as performance (model loading, KV cache, and MoE); why backends and runtimes are real constraints (CUDA vs ROCm vs Metal/MLX vs CPU); and how to scale from one box to multi-GPU and multi-machine setups.
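
For a rough feel of the memory side, here is a back-of-the-envelope sketch of how much room the weights and the KV cache take. It uses the usual estimate for standard attention (two tensors, K and V, per layer per token) and ignores activations and runtime overhead. The function names and the model shape in the example (8B parameters, 32 layers, 8 KV heads, head size 128, 16-bit values) are hypothetical illustrations, not numbers from the episode.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def weight_bytes(n_params: int, bytes_per_param: float) -> float:
    """Memory needed just to hold the model weights."""
    return n_params * bytes_per_param


def kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    context_tokens: int,
    batch_size: int = 1,
    bytes_per_value: int = 2,
) -> int:
    """Approximate KV cache size: K and V tensors for every layer and token."""
    return (
        2  # one K tensor and one V tensor
        * n_layers
        * n_kv_heads
        * head_dim
        * context_tokens
        * batch_size
        * bytes_per_value
    )


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    weights = weight_bytes(n_params=8_000_000_000, bytes_per_param=2)
    cache = kv_cache_bytes(
        n_layers=32, n_kv_heads=8, head_dim=128, context_tokens=8192
    )
    print(f"weights:  {weights / GIB:.1f} GiB")
    print(f"KV cache: {cache / GIB:.1f} GiB")
    print(f"total:    {(weights + cache) / GIB:.1f} GiB")
```

With these assumed numbers the weights come to about 14.9 GiB and the KV cache adds 1 GiB at 8,192 tokens. The cache grows linearly with context length and batch size, which is one reason a model that loads fine can still run out of memory on long prompts.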


Keep your AI on a leash.


Links mentioned:

- GPU Glossary (Modal): https://modal.com/gpu-glossary

- CUDA → ROCm headline: https://wccftech.com/the-claude-code-has-managed-to-port-nvidia-cuda-backend-to-rocm-in-just-30-minutes/

- Unsloth PR: https://github.com/unslothai/unsloth/pull/3856


Domesticating AI, by SoyPete Tech