February 13, 2026

From “Inference Box” to Dev Rig: What NVIDIA DGX Spark Actually Is | Ep 2

43 minutes

Everyone keeps calling NVIDIA DGX Spark an “inference box”… but in practice it behaves more like a dev rig.

In Ep 2 of Domesticating AI, we break down what Spark is actually good for (AI development + fine-tuning) vs what it isn’t (a magical drop-in inference server). We also dig into why unified memory changes the local-AI experience, the “gateway stack” (Ollama + Open WebUI), when you outgrow turnkey UIs, and how homelab economics + networking decisions shape what you should run at home.

In this episode

Training vs inference (and why “inference server” gets misused)
Unified memory: what it changes for model loading + workflows
Ollama + Open WebUI as the fastest on-ramp for local AI
Fine-tuning workflows (QLoRA/Unsloth-style) and where Spark shines
Homelab reality: Docker “recipes,” troubleshooting, and collaboration
Safer remote access: Tailscale
Cloud vs home economics (when cloud is cheaper… and when it explodes)

NVIDIA / DGX Spark

DGX Spark: https://www.nvidia.com/en-us/products/workstations/dgx-spark/
Build hub / recipes: https://build.nvidia.com/spark
NIM on Spark playbook: https://build.nvidia.com/spark/nim-llm

Local AI runners + UIs

Ollama: https://ollama.com/
Open WebUI (GitHub): https://github.com/open-webui/open-webui
Open WebUI docs: https://docs.openwebui.com/
llama.cpp: https://github.com/ggml-org/llama.cpp
LM Studio: https://lmstudio.ai/
vLLM: https://github.com/vllm-project/vllm
Jan: https://jan.ai/

Fine-tuning + workflows

Unsloth: https://github.com/unslothai/unsloth

Image generation tools (mentioned)

ComfyUI: https://github.com/Comfy-Org/ComfyUI
AUTOMATIC1111 SD WebUI: https://github.com/AUTOMATIC1111/stable-diffusion-webui

Networking / Remote access

Tailscale: https://tailscale.com/

Cloud GPU alternatives (mentioned)

Runpod pricing: https://www.runpod.io/pricing
Modal pricing: https://modal.com/pricing

Miriah Peterson (Host): Miriah Peterson is a software engineer, Go educator, and community builder focused on production-first AI—treating LLM systems like real software with real users. She runs SoyPete Tech (streams + writing + open-source projects) and stays active in the Utah dev community through meetups and events, with a practical focus on shipping local and cloud AI systems.
Connect:

SoyPete Tech (YouTube): https://www.youtube.com/@SoyPete_Tech
SoyPete Tech (Substack): https://soypetetech.substack.com/
LinkedIn: https://www.linkedin.com/in/miriah-peterson-35649b5b/

Matt Sharp (Host): Matt Sharp is an AI Engineer and Strategist for a tech consulting firm and co-author of LLMs in Production. He’s a recovering data scientist and MLOps expert with 10+ years of experience operationalizing ML systems in production. Matt also teaches a graduate-level MLOps-in-production course at Utah State University as an adjunct professor. You can find him on Substack (Data Pioneer), on LinkedIn, and on his other podcast, the Learning Curve.
Connect:

Data Pioneer (Substack): https://thedatapioneer.substack.com/

Chris Brousseau (Host): Chris Brousseau is a linguist by training and an NLP practitioner by trade, with a career spanning linguistically informed NLP, modern LLM systems, and MLOps practices. He’s co-author of LLMs in Production and is currently VP of AI at VEOX. You can find him as IMJONEZZ (two Z’s) on YouTube and GitHub, and on LinkedIn.
Connect:

YouTube (IMJONEZZ): https://www.youtube.com/channel/UCPtkaw_x97yP4WevW7axk0g
LinkedIn: https://www.linkedin.com/in/chris-brousseau/en

📘 LLMs in Production (Matt Sharp & Chris Brousseau): https://www.manning.com/books/llms-in-production
