
Join us for Part 1 of a 3-part series as Du'An Lightfoot (Senior AI Engineer at Akamai) breaks down everything you need to know to get started running AI models locally on your own hardware.
Du'An walks through the fundamentals of local AI, from why you'd want to run models privately (data ownership, air-gapped environments, IP protection) to the hardware concepts that make it possible. You'll learn how inference works under the hood, why GPUs matter for AI workloads, how to choose and quantize models for your hardware, and how to get up and running with tools like Ollama. Future episodes in the series cover serving models via an API and distributing inference at the edge with Kubernetes.
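If you want to try the Ollama workflow covered in the episode, here is a minimal sketch using the official Ollama Python client. It assumes Ollama is installed from ollama.com and that you've already pulled a model; the "llama3.2" tag is just an example, swap in whatever model fits your hardware.

# Assumes: Ollama installed and running locally (https://ollama.com/),
# the official Python client installed (pip install ollama), and a model
# pulled beforehand, e.g.: ollama pull llama3.2  (example tag only)
import ollama

# Send one chat message to the locally running Ollama server and print the reply
response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Why would I run an LLM locally?"}],
)
print(response["message"]["content"])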
Timestamps
0:00 Cold Open: Why Local AI?
0:26 Welcome & Introduction
1:30 Following Up on the Frontier Models Episode
2:25 Du'An's Background & AI Inference at Akamai
3:49 What If You Wanted to Own Your Data?
5:00 Local AI vs Cloud AI: A Different Layer of the Stack
5:47 Why GPUs Matter: The Nvidia Story
7:03 CPU vs GPU: Serial vs Parallel Processing
8:28 Model Weights & Quantization Explained
How to find Du'An:
https://www.duanlightfoot.com/
https://github.com/labeveryday/
Links from the show:
https://ollama.com/
https://apxml.com/
https://localllm.in/
https://huggingface.co/
https://github.com/labeveryday/claude-put-in-work
https://claude.ai/
By vBrownBag