Life with AI

#84 - FineWeb, the best dataset to pre-train LLMs.


Hey guys, in this episode I talk about FineWeb, the best open-source pre-training dataset to date. I explain how the dataset was created and share some of the results.


Link to the Hugging Face blog post: https://huggingface.co/spaces/HuggingFaceFW/blogpost-fineweb-v1

Instagram of the podcast: https://www.instagram.com/podcast.lifewithai

LinkedIn of the podcast: https://www.linkedin.com/company/life-with-ai


Life with AI, by Filipe Lauar
