
Everyone keeps calling NVIDIA DGX Spark an “inference box”… but in practice it behaves more like a dev rig.
In Ep 2 of Domesticating AI, we break down what Spark is actually good for (AI development + fine-tuning) vs what it isn’t (a magical drop-in inference server). We also dig into why unified memory changes the local-AI experience, the “gateway stack” (Ollama + Open WebUI), when you outgrow turnkey UIs, and how homelab economics + networking decisions shape what you should run at home.
In this episode
Training vs inference (and why “inference server” gets misused)
Unified memory: what it changes for model loading + workflows
Ollama + Open WebUI as the fastest on-ramp for local AI
Fine-tuning workflows (QLoRA/Unsloth-style) and where Spark shines
Homelab reality: Docker “recipes,” troubleshooting, and collaboration
Safer remote access: Tailscale
Cloud vs home economics (when cloud is cheaper… and when it explodes)
Links & Resources
NVIDIA / DGX Spark
DGX Spark: https://www.nvidia.com/en-us/products/workstations/dgx-spark/
Build hub / recipes: https://build.nvidia.com/spark
NIM on Spark playbook: https://build.nvidia.com/spark/nim-llm
Local AI runners + UIs
Ollama: https://ollama.com/
Open WebUI (GitHub): https://github.com/open-webui/open-webui
Open WebUI docs: https://docs.openwebui.com/
llama.cpp: https://github.com/ggml-org/llama.cpp
LM Studio: https://lmstudio.ai/
vLLM: https://github.com/vllm-project/vllm
Jan: https://jan.ai/
Fine-tuning + workflows
Unsloth: https://github.com/unslothai/unsloth
Image generation tools (mentioned)
ComfyUI: https://github.com/Comfy-Org/ComfyUI
AUTOMATIC1111 SD WebUI: https://github.com/AUTOMATIC1111/stable-diffusion-webui
Networking / Remote access
Tailscale: https://tailscale.com/
Cloud GPU alternatives (mentioned)
Runpod pricing: https://www.runpod.io/pricing
Modal pricing: https://modal.com/pricing
Hosts
Miriah Peterson (Host): Miriah Peterson is a software engineer, Go educator, and community builder focused on production-first AI—treating LLM systems like real software with real users. She runs SoyPete Tech (streams + writing + open-source projects) and stays active in the Utah dev community through meetups and events, with a practical focus on shipping local and cloud AI systems.
Connect:
SoyPete Tech (YouTube): https://www.youtube.com/@SoyPete_Tech
SoyPete Tech (Substack): https://soypetetech.substack.com/
LinkedIn: https://www.linkedin.com/in/miriah-peterson-35649b5b/
Matt Sharp (Host): Matt Sharp is an AI Engineer and Strategist for a tech consulting firm and co-author of LLMs in Production. He’s a recovering data scientist and MLOps expert with 10+ years of experience operationalizing ML systems in production. Matt also teaches a graduate-level MLOps-in-production course at Utah State University as an adjunct professor. You can find him on Substack (Data Pioneer), on LinkedIn, and on his other podcast, the Learning Curve.
Connect:
Data Pioneer (Substack): https://thedatapioneer.substack.com/
Chris Brousseau (Host): Chris Brousseau is a linguist by training and an NLP practitioner by trade, with a career spanning linguistically informed NLP, modern LLM systems, and MLOps practices. He’s co-author of LLMs in Production and is currently VP of AI at VEOX. You can find him as IMJONEZZ (two Z’s) on YouTube, GitHub, and LinkedIn.
Connect:
YouTube (IMJONEZZ): https://www.youtube.com/channel/UCPtkaw_x97yP4WevW7axk0g
LinkedIn: https://www.linkedin.com/in/chris-brousseau/en
📘 LLMs in Production (Matt Sharp & Chris Brousseau): https://www.manning.com/books/llms-in-production
By SoyPete Tech