Julian Zaidi and Marc-André Carbonneau discuss the practical differences between academic speech research and industrial application in the video game sector. The guests compare the Speech Synthesis Workshop with the broader Interspeech conference, noting that smaller venues allow for more technical discussions on voice cloning and evaluation metrics. They highlight critical challenges for the industry, such as maintaining speaker identity across varied emotions and the need for compact models that run efficiently on local hardware. The conversation also explores the cultural shift required when moving to a commercial environment, where consistent reliability and effective communication with non-experts are prioritised over theoretical novelty. Ultimately, the speakers emphasise that while academia thrives on exploration, industry demands functional, high-performance tools that meet the strict expectations of players.
Host: Paige Tuttösi
Post-Production: Zhengjun Yue, Pascal Hecker