BuzzStream Digital PR and Link Building Podcast

Is Common Crawl Secretly Being Used to Train All LLMs?


In this episode of the BuzzStream podcast, host Vince Nero interviews Metehan Yesilyurt, Chief Growth Officer at AEO Vision, about the significance of AI training data, particularly Common Crawl.

They explore how AI models use this data, the importance of metrics such as PageRank and harmonic centrality, and the challenges posed by data accessibility.

The conversation emphasizes the need for relevancy in AI citations and the evolving landscape of AI and SEO.

⏰ Chapters

00:00 Introduction

02:17 Understanding AI Training Data

04:49 Exploring Common Crawl

07:37 The Role of Common Crawl in AI

10:12 Metrics: Page Rank and Harmonic Centrality

12:31 AI Citations and Brand Positioning

15:21 The Future of AI and Data Sources

17:35 Relevancy vs. Authority in Links

20:20 Challenges with Common Crawl

22:52 Final Thoughts and Future Predictions

🔗 Links and Resources:

Connect with Metehan:

https://www.linkedin.com/in/metehanyesilyurt/

https://aeovision.ai/

His tool and study:

https://metehan.ai/blog/cc-rank/

https://webgraph.metehan.ai/

Common Crawl's article:

https://commoncrawl.org/blog/how-seos-are-using-common-crawls-web-graph-data-for-ai-ranking-signals

BuzzStream's article:

https://www.buzzstream.com/blog/publishers-block-ai-study/

ListIQ:

https://www.buzzstream.com/listiq

BuzzStream Digital PR and Link Building Podcast, by BuzzStream