Fragmented - AI Developer Podcast

308 - How Image Diffusion Models Work - the 20 minute explainer



You already know how LLMs work from our popular 20-minute explainer. Now we take it to images. What does Michelangelo have to do with Stable Diffusion? More than you'd think. Walk away knowing how image generation actually works, and what it has in common with the text models you already understand.

Full shownotes at fragmentedpodcast.com.

Show Notes
  • Episode 303 - How LLMs work in 20 minutes - text generation
  • VAE - Variational Autoencoder
  • RGB Color model - Wikipedia
  • Word2Vec technique - Wikipedia
    • Efficient Estimation of Word Representations in Vector Space - original Word2Vec paper by Mikolov et al.
  • High-Resolution Image Synthesis with Latent Diffusion Models - Rombach et al. (2022), the paper behind Stable Diffusion
  • Image Training data
    • LAION-5B - 5 billion image-text pairs scraped from the web, used to train many image generation models
    • WebLI - Google's internal image-text dataset
  • Michelangelo

Get in touch

We'd love to hear from you. Email is the best way to reach us or you can check our contact page for other ways.

We want to hear all the feedback: what's working, what's not, topics you'd like to hear more on.

  • Contact us
  • Newsletter
  • YouTube
  • Website

Co-hosts:
  • Kaushik Gopal
  • Iury Souza

[!fyi] We transitioned from Android development to AI starting with Ep. #300. Listen to that episode for the full story behind our new direction.



