
Paper: Sequence to Sequence Learning with Neural Networks — Ilya Sutskever, Oriol Vinyals & Quoc Le (2014)
The one-sentence summary: From ‘I ❤ Cats’ to ‘J’❤ les chats’, or how two LSTMs started talking to each other and taught the world machine translation.
What It’s About
Picture a relay race where Runner #1 takes a message in English, hands the baton to Runner #2, and Runner #2, without tripping, sprints across the language barrier to deliver it in fluent French. That, in spirit, is what Sutskever and friends pulled off in 2014: an encoder–decoder LSTM pipeline that transformed sequences into … well, other sequences. One LSTM reads the input into a fixed-size vector; a second LSTM generates the output from that vector. It was one of the first demonstrations that a single end-to-end neural network could listen, remember, and speak, with no hand-crafted phrase tables required.
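To make the relay-race picture concrete, here is a minimal sketch of the encoder-to-decoder hand-off in NumPy. It is a toy under loud assumptions, not the paper's implementation: the weights are random rather than trained, a plain tanh RNN step stands in for the LSTM cell, decoding is greedy (the paper used beam search), and every name and size here (`encode`, `decode`, `EOS`, `HIDDEN`) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB_SRC, VOCAB_TGT, HIDDEN = 10, 12, 16  # toy sizes, chosen arbitrarily
EOS = 0                                    # hypothetical end-of-sequence token id

# Randomly initialised parameters; a real model would learn these from data.
E_src = rng.normal(0, 0.1, (VOCAB_SRC, HIDDEN))   # source embeddings
E_tgt = rng.normal(0, 0.1, (VOCAB_TGT, HIDDEN))   # target embeddings
W_enc = rng.normal(0, 0.1, (2 * HIDDEN, HIDDEN))  # encoder recurrence
W_dec = rng.normal(0, 0.1, (2 * HIDDEN, HIDDEN))  # decoder recurrence
W_out = rng.normal(0, 0.1, (HIDDEN, VOCAB_TGT))   # hidden state -> target logits


def rnn_step(x, h, W):
    """One tanh RNN step (a simplified stand-in for an LSTM cell)."""
    return np.tanh(np.concatenate([x, h]) @ W)


def encode(src_ids):
    """Runner #1: read the source tokens and return one fixed-size vector."""
    h = np.zeros(HIDDEN)
    for tok in src_ids:
        h = rnn_step(E_src[tok], h, W_enc)
    return h


def decode(h, max_len=8):
    """Runner #2: start from the encoder's vector and emit tokens greedily."""
    out, prev = [], EOS  # EOS doubles as the start token in this toy
    for _ in range(max_len):
        h = rnn_step(E_tgt[prev], h, W_dec)
        prev = int(np.argmax(h @ W_out))
        if prev == EOS:
            break
        out.append(prev)
    return out


baton = encode([3, 1, 4, 1, 5])   # the "baton" handed between the two runners
translation = decode(baton)
print(baton.shape, translation)
```

Whatever the length of the input, `encode` returns a vector of the same size, and that vector is the only thing the decoder ever sees. With untrained weights the output tokens are meaningless; the point is the shape of the hand-off.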
Key Takeaways for Busy Humans
* End-to-End Everything — Say goodbye to hand-engineered pipelines; data in, translation out.
* Universal Interface — Any input/output that can be serialized (audio, code, protein sequences) is fair game.
* Foundation for Attention — The pain of squeezing long sentences into a single vector helped motivate Bahdanau-style attention, which appeared on arXiv the same year and was published at ICLR 2015, and ultimately the Transformer.
* Encoder–Decoder as a Mindset — Prompts + completions, image captions, even humanoid-robot task planning all echo this two-brain pattern.
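The single-vector bottleneck mentioned in the takeaways, and the way attention relaxes it, can be sketched in a few lines. This is a simplified illustration with assumed details: it scores with a plain dot product, whereas Bahdanau-style attention scores with a small learned network, and the array sizes and the name `attention_context` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN = 4  # toy hidden size, chosen arbitrarily


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def attention_context(decoder_state, encoder_states):
    """Weight every encoder state by its similarity to the decoder state."""
    scores = encoder_states @ decoder_state   # one score per source position
    weights = softmax(scores)                 # non-negative, sums to 1
    return weights @ encoder_states, weights  # weighted average of the states


encoder_states = rng.normal(size=(7, HIDDEN))  # 7 source tokens, random toy values
decoder_state = rng.normal(size=HIDDEN)

context, weights = attention_context(decoder_state, encoder_states)
print(context.shape, weights.round(3))
```

Instead of relying on one final vector, the decoder gets a fresh weighted mix of all encoder states at each output step, so long sentences no longer have to fit through a single hand-off.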
“Wolf Bites” — Skimmable Nuggets
* An ensemble of five deep LSTMs beat the phrase-based SMT baseline on the WMT’14 English→French benchmark, 34.8 BLEU to 33.3, which was legendary at the time.
* Training on 12M sentence pairs (about 7.5 epochs in total) took roughly ten days on an 8-GPU machine. A single modern GPU such as an RTX 4090 would likely finish a comparable run far faster.
* Google Translate quietly adopted seq2seq in late 2016, causing users worldwide to wonder if the product had been possessed by fluent spirits overnight.
Notes: The podcasts for this series are made with Google NotebookLM, and the two podcasters you hear are AI-generated. The sources used to generate today’s “notebook” were: 1) the original paper and 2) this article.
Read the original paper here.
Sources
* Sutskever, I.; Vinyals, O.; Le, Q. V. “Sequence to Sequence Learning with Neural Networks.” Advances in Neural Information Processing Systems 27 (2014).
* Google Research Blog. “A Neural Network for Machine Translation, at Production Scale” (2016).
* Kilcher, Y. “Seq2Seq Explained.” YouTube, 2020.
#Seq2Seq #MachineTranslation #DeepLearning #AIHistory #TheWolfReadsAI #deeplearningwiththewolf #dianawolftorres #deeplearning #sutskever #sequencetosequencelearning #ilyasutskever
By Diana Wolf Torres