Mechanical Dreams

Apertus Tech Report


In this episode:
• Another Week, Another 'Open' Model?: Linda introduces the Apertus paper, framing it as a response to the systemic shortcomings of current open models. Professor Norris questions what makes this one different from the countless other 'open' releases.
• Data Compliance and the Goldfish in the Machine: The hosts dive into Apertus's strict data compliance, including its novel retroactive application of robots.txt and the use of the 'Goldfish' training objective to prevent the model from memorizing its training data.
• More Than Just English: A Truly Global LLM: Linda gets excited about the model's multilingual reach, with training data covering over 1800 languages. They discuss the implications for low-resource languages and the significance of a 40% non-English training data mix.
• The Swiss AI Charter and Other Training Secrets: The discussion turns to the technical details of training Apertus, including its unique optimizer and its novel approach to safety alignment using a 'Swiss AI Charter' for controversial topics.
• Final Thoughts: A New Standard for Openness?: Professor Norris and Linda summarize Apertus's contributions, concluding that its commitment to compliance, multilingualism, and full transparency sets a powerful new benchmark for the entire field.

Mechanical Dreams, by Mechanical Dirk