
Interested in being a guest? Email us at [email protected]
AI feels fast until memory and storage slow everything down. We sit down with Michael Wu to unpack a blunt truth: inference is where value happens, and storage now sits on the critical path. Instead of treating SSDs as cold capacity, Phison's adaptive middleware turns them into a live cache that expands usable memory and keeps models, embeddings, and long context windows close to compute. The payoff is practical and immediate: lean AI PCs and mini workstations run bigger workloads with steadier latency, and teams can scale inference without waiting for DRAM supply to catch up.
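To make the pattern concrete, here is a minimal sketch of the tiered-caching idea discussed in the episode: a small in-memory (DRAM) tier backed by a larger flash-resident store, so hot tensors stay close to compute while cold data spills to the SSD. This is an illustration of the general technique only, not Phison's actual middleware; the class name, directory, and keys are invented for the example.

```python
import os
from collections import OrderedDict

class TieredCache:
    """Illustrative DRAM-over-SSD cache: hot items live in RAM (LRU),
    cold items spill to a flash-backed directory when RAM is full."""

    def __init__(self, ssd_dir: str, ram_capacity: int = 4):
        self.ram = OrderedDict()          # simulated DRAM tier, LRU-ordered
        self.ram_capacity = ram_capacity  # max items held in RAM
        self.ssd_dir = ssd_dir            # simulated SSD tier
        os.makedirs(ssd_dir, exist_ok=True)

    def _ssd_path(self, key: str) -> str:
        return os.path.join(self.ssd_dir, f"{key}.bin")

    def put(self, key: str, value: bytes) -> None:
        self.ram[key] = value
        self.ram.move_to_end(key)         # mark as most recently used
        if len(self.ram) > self.ram_capacity:
            cold_key, cold_val = self.ram.popitem(last=False)  # evict LRU
            with open(self._ssd_path(cold_key), "wb") as f:
                f.write(cold_val)         # spill cold data to flash

    def get(self, key: str) -> bytes:
        if key in self.ram:               # DRAM hit: fast path
            self.ram.move_to_end(key)
            return self.ram[key]
        with open(self._ssd_path(key), "rb") as f:  # SSD hit: slower, but avoids recompute
            value = f.read()
        self.put(key, value)              # promote back into the DRAM tier
        return value

# Hypothetical usage: caching a KV block for a long context window.
cache = TieredCache("/tmp/kv_spill", ram_capacity=2)
cache.put("layer0.kv", b"...")
```

Real adaptive middleware would add prefetching, wear-aware placement, and asynchronous I/O, but the core design choice is the same: treat flash as an extension of the memory hierarchy rather than as cold storage.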
We trace the story from CES announcements to real-world deployment. Michael breaks down how OEM integrations and consumer upgrade kits bring adaptive caching to both new and existing machines, why developer and education communities are the first winners, and how this bottom-up momentum seeds better software and on-device AI experiences. For enterprise leaders, we map the route from local experiments to global rollouts: consistent performance across distributed teams, lower cloud egress, and a storage layer tuned for retrieval-augmented generation, fine-tuning, and high-concurrency serving.
Zooming out, we explore where we are in the AI cycle (early, hungry, and still building) and how edge devices and "physical AI" will broaden demand for fast, cache-aware storage. Michael also shares Phison's fabless strategy, the new Pascari enterprise lineup, and the push toward PCIe Gen 6 performance that aligns with next-generation model serving. If you care about real-world AI velocity, this conversation shows how to turn a bottleneck into an advantage by rethinking the memory hierarchy from the ground up.
If this helped you think differently about scaling AI, follow the show, share it with a teammate, and leave a quick review so more builders can find it. What’s the first AI workflow you’d speed up with adaptive caching?
Support the show
More at https://linktr.ee/EvanKirstel
By Evan Kirstel