The New Stack Podcast

How Training Data Differentiates Falcon, the LLM from the UAE



The name "Falcon" for the UAE’s large language model (LLM) evokes the national bird's qualities of courage and perseverance, reflecting the vision of the Technology Innovation Institute (TII) in Abu Dhabi. TII, launched in 2020, responds to AI’s rapid advancement and its unintended consequences by taking an open source approach intended to improve the community's understanding and control of AI. In this episode of The New Stack Makers, Dr. Hakim Hacid, executive director and acting chief researcher at TII, emphasized the importance of perseverance and innovation in overcoming challenges. Falcon gained attention as the first truly open model with capabilities matching those of many closed source models, opening new possibilities for practitioners and industry.

Last June, TII introduced a 40-billion-parameter Falcon model that outperformed LLaMA-65B, while smaller models in the family enable local inference without the cloud. The latest model, with 180 billion parameters and trained on 3.5 trillion tokens, illustrates Falcon’s commitment to quality and efficiency over sheer size. Falcon’s distinctiveness lies in its data quality: more than 80% of its training data comes from RefinedWeb, a dataset based on CommonCrawl that is cleaned and deduplicated to yield higher-quality results. This data-centric approach, combined with powerful computational resources, sets Falcon apart in the AI landscape.

 

Learn more from The New Stack about Open Source AI: 

Open Source Initiative Hits the Road to Define Open Source AI 

Linus Torvalds on Security, AI, Open Source and Trust

Transparency and Community: An Open Source Vision for AI 

Join our community of newsletter subscribers to stay on top of the news and at the top of your game. 


Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
