A single generalist VLA that pairs a Qwen3.5-4B backbone with a 1.15B-parameter DiT flow-matching action decoder, unifying manipulation, navigation, and trajectory prediction across 11 embodiments through text-described embodiment prompts. The model is trained in four stages and outperforms task-specific specialists on real-world ALOHA and simulation benchmarks without per-task fine-tuning.
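To make the two mechanisms named above concrete, here is a minimal sketch, not the model's actual code. It assumes a straight-line (rectified-flow style) probability path and Euler integration, which the summary does not confirm. The prompt fields, `embodiment_prompt`, `toy_velocity`, and `sample_action` are all hypothetical names. `toy_velocity` is a hand-written stand-in for the learned DiT decoder, which in the real system would be conditioned on the backbone's features rather than on a known target.

```python
import random


def embodiment_prompt(name, action_dim, control_hz, task):
    """Hypothetical text description of a robot; the fields are illustrative only."""
    return (
        f"Embodiment: {name}. Action dimension: {action_dim}. "
        f"Control rate: {control_hz} Hz. Task: {task}"
    )


def toy_velocity(x, t, target):
    """Stand-in for the learned decoder (assumption).

    On a straight-line path from noise to `target`, the velocity that
    carries the current point x to the target by t = 1 is (target - x) / (1 - t).
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample_action(target, steps=10, seed=0):
    """Integrate the velocity field from Gaussian noise (t = 0) toward t = 1."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from pure noise
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays strictly below 1, so no division by zero
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]  # explicit Euler step
    return x


prompt = embodiment_prompt("bimanual arm", 14, 50, "fold the towel")
action = sample_action(target=[0.2, -0.5, 1.0])
```

In this toy setting the sampler lands on `target` because the velocity field is exact. A trained decoder only approximates that field, so the number of integration steps becomes a trade-off between accuracy and latency.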