Linear Digressions

How to evaluate a translation: BLEU scores


Listen Later

As anyone who's encountered a badly translated text could tell you, not all translations are created equal. Some translations are smooth, fluent and sound like a poet wrote them; some are jerky, non-grammatical and awkward. When a machine is doing the translating, it's awfully easy to end up with a robotic-sounding text; as the state of the art in machine translation improves, though, a natural question to ask is: according to what measure? How do we quantify a "good" translation?
Enter the BLEU score, which is the standard metric for quantifying the quality of a machine translation. BLEU rewards translations that have large overlap with human translations of sentences, with some extra heuristics thrown in to guard against weird pathologies (like full sentences getting translated as one word, redundancies, and repetition). Nowadays, if there's a machine translation being evaluated or a new state-of-the-art system (like the Google neural machine translation we've discussed on this podcast before), chances are that there's a BLEU score going into that assessment.
...more
View all episodesView all episodes
Download on the App Store

Linear DigressionsBy Ben Jaffe and Katie Malone

  • 4.8
  • 4.8
  • 4.8
  • 4.8
  • 4.8

4.8

353 ratings


More shows like Linear Digressions

View all
Stuff You Should Know by iHeartPodcasts

Stuff You Should Know

78,613 Listeners

Practical AI by Practical AI LLC

Practical AI

200 Listeners