February 02, 2026

303 - How LLMs Work - the 20 minute explainer

25 minutes

Ever get asked "how do LLMs work?" at a party and freeze? We walk through the full pipeline: tokenization, embeddings, inference — so you understand it well enough to explain it. Walk away with a mental model that you can use for your next dinner party.

Full shownotes at fragmentedpodcast.com.

Show Notes

Words -> Tokens:

OpenAI Tokenizer visualizer - Visualize how text becomes tokens

Tokens -> Embeddings:

RGB Color model - Wikipedia
Word2Vec technique - Wikipedia
Efficient Estimation of Word Representations in Vector Space - original Word2Vec paper by Mikolov et al.

Embeddings -> Inference:

Word embedding
Temperature, Top-k, Top-p sampling
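
For anyone who wants to poke at the sampling knobs named above, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not how any particular model or library implements it: the function name `sample_next_token`, its parameters, and the toy logits are all hypothetical, and real systems work on vocabularies of tens of thousands of tokens.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Pick one token id from raw logits (hypothetical helper for illustration)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature: divide logits before softmax.
    # Values below 1 sharpen the distribution, values above 1 flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token ids from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most likely tokens.
    if top_k is not None:
        ranked = ranked[:max(1, top_k)]

    # Top-p (nucleus): keep the smallest prefix of the ranking whose
    # cumulative probability reaches p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        ranked = kept

    # random.choices accepts unnormalized weights, so the surviving
    # probabilities are effectively renormalized here.
    weights = [probs[i] for i in ranked]
    return rng.choices(ranked, weights=weights, k=1)[0]


# Toy logits for a three-token vocabulary (made-up numbers).
toy_logits = [2.0, 1.0, 0.1]
print(sample_next_token(toy_logits, temperature=0.8, top_k=2, top_p=0.9))
```

With `top_k=1` the function always returns the most likely token, which is the same as greedy decoding; loosening `top_k`, `top_p`, or raising `temperature` lets less likely tokens through.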

Get in touch

We'd love to hear from you. Email is the best way to reach us, or you can check our contact page for other ways.

We want to hear all the feedback: what's working, what's not, topics you'd like
to hear more on. We want to make the show better for you so let us know!

Contact us
Newsletter
YouTube
Website

Co-hosts:

Kaushik Gopal
Iury Souza

[!fyi] We transitioned from Android development to AI starting with
Ep. #300. Listen to that episode for the full story behind
our new direction.


By Kaushik Gopal, Iury Souza
