
Domesticating AI — S01E01: Your First AI at Home
Hosts: Miriah Peterson, Matt Sharp, Chris Brousseau
This episode is your practical on-ramp to running AI at home: why inference engines matter, what to install first, and how to make “local AI” feel stable instead of fragile. The hosts start with a hardware + market reality check (tinygrad’s tinybox-style “AI server appliance” idea and the ongoing memory/RAM crunch), then break down what an inference engine actually does, how popular runtimes compare (llama.cpp, vLLM, Ollama, TGI), and a sane starter workflow for getting from “downloaded a model” to “usable local AI.”
0:00 Intro + host chaos + what the show is
1:08 News: tinygrad / “AI server appliance” thinking (tinybox vibes)
2:44 News: RAM prices + the memory crunch for builders
8:26 Main: building your first AI at home (why now)
8:49 What is an inference engine?
12:30 Engines compared: llama.cpp vs vLLM vs Ollama vs TGI
15:42 Do you need to buy a new computer? (CPU vs GPU realities)
25:32 Models for home: fit-to-hardware, quantization, context
34:37 Leaderboards vs evals: picking models you can trust
44:00 Community + meetups + where to follow
45:22 Outro — “Keep your AI on a leash”
News / context
Inference engines
By SoyPete Tech