The technical report introduces Aya 23, a family of open-weight, multilingual instruction-tuned language models developed by Cohere For AI that support 23 languages.
Building on the previous Aya 101 model, which prioritized language breadth (covering 101 languages), Aya 23 instead trades breadth for depth. By allocating more model capacity to fewer languages, all of which are emphasized during pre-training, Aya 23 mitigates the well-documented "curse of multilinguality": the phenomenon where a model's performance on individual languages degrades when its capacity is shared across too many languages. The rough sketch below illustrates the intuition.
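As a back-of-the-envelope illustration of the depth-versus-breadth trade-off, the Python sketch below naively divides total parameter count by the number of supported languages. The even per-language split is an assumption for intuition only (real multilingual models share most capacity across languages), but the known model sizes are real: Aya 101 is a 13B-parameter model covering 101 languages, while Aya 23 ships in 8B and 35B variants covering 23 languages.

```python
# Naive "capacity per language" comparison for the depth-vs-breadth intuition.
# Assumption: parameters split evenly across languages, which is a
# simplification -- multilingual models actually share most weights.

MODELS = {
    "Aya 101 (13B, 101 languages)": (13e9, 101),
    "Aya 23  (8B,  23 languages)":  (8e9, 23),
    "Aya 23  (35B, 23 languages)":  (35e9, 23),
}

for name, (params, n_langs) in MODELS.items():
    per_lang = params / n_langs
    print(f"{name}: ~{per_lang / 1e9:.2f}B parameters per language")
```

Under this naive split, Aya 23's 35B variant devotes roughly an order of magnitude more capacity to each language (~1.5B parameters) than Aya 101 did (~0.13B), which is the core of the paper's "depth" argument.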
Key highlights of the paper include: