Sébastien Bratières, Director of AI at Translated, outlines “DVPS”, a four-year EU project to build multimodal foundation models with partners including EPFL, Oxford, and ETH Zurich. We explore why language tech must move beyond text to “physical AI,” integrating speech, video, handwriting, environmental data (satellite imagery), and cardiology data so that models learn meaning from real-world context.