The AI Concepts Podcast

Module 2: The MLP Layer - Where Transformers Store Knowledge



Shay explains where a transformer actually stores knowledge: not in attention, but in the MLP (feed-forward) layer. The episode frames the transformer block as a two-step loop: attention moves information between tokens, then the MLP transforms each token’s representation independently to inject learned knowledge.
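To make the two-step loop concrete, here is a minimal sketch of a single transformer block in PyTorch. It is not the episode's code, just an illustration under standard assumptions: a pre-norm block, hypothetical sizes (d_model, n_heads, d_ff), and residual connections around both sub-layers. The key point it shows is that attention mixes information across token positions, while the MLP is applied to each token's vector independently.

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """Minimal pre-norm transformer block (illustrative sketch):
    attention moves information between tokens, then the MLP
    transforms each token's representation independently."""

    def __init__(self, d_model: int = 64, n_heads: int = 4, d_ff: int = 256):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        # The MLP (feed-forward) layer: the same two linear maps are applied
        # to every token position on its own -- this is the part the episode
        # identifies as where learned knowledge is stored.
        self.mlp = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Step 1: attention mixes information across token positions.
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out
        # Step 2: the MLP acts on each token vector independently (no mixing).
        x = x + self.mlp(self.norm2(x))
        return x


if __name__ == "__main__":
    block = TransformerBlock()
    tokens = torch.randn(1, 10, 64)  # (batch, sequence length, d_model)
    print(block(tokens).shape)       # torch.Size([1, 10, 64])
```

Because the MLP sees one token at a time, any fact it contributes must already be encoded in that token's vector, which is why attention's job of routing information between tokens comes first in the loop.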


By Sheetal ’Shay’ Dhar