Machine Learning Street Talk (MLST)

Ryan Greenblatt - Solving ARC with GPT4o


Listen Later

Ryan Greenblatt from Redwood Research recently published "Getting 50% on ARC-AGI with GPT-4.0," where he used GPT4o to reach a state-of-the-art accuracy on Francois Chollet's ARC Challenge by generating many Python programs.


Sponsor:

Sign up to Kalshi here https://kalshi.onelink.me/1r91/mlst -- the first 500 traders who deposit $100 will get a free $20 credit! Important disclaimer - In case it's not obvious - this is basically gambling and a *high risk* activity - only trade what you can afford to lose.


We discuss:

- Ryan's unique approach to solving the ARC Challenge and achieving impressive results.

- The strengths and weaknesses of current AI models.

- How AI and humans differ in learning and reasoning.

- Combining various techniques to create smarter AI systems.

- The potential risks and future advancements in AI, including the idea of agentic AI.


https://x.com/RyanPGreenblatt

https://www.redwoodresearch.org/



Refs:

Getting 50% (SoTA) on ARC-AGI with GPT-4o [Ryan Greenblatt]

https://redwoodresearch.substack.com/p/getting-50-sota-on-arc-agi-with-gpt


On the Measure of Intelligence [Chollet]

https://arxiv.org/abs/1911.01547


Connectionism and Cognitive Architecture: A Critical Analysis [Jerry A. Fodor and Zenon W. Pylyshyn]

https://ruccs.rutgers.edu/images/personal-zenon-pylyshyn/proseminars/Proseminar13/ConnectionistArchitecture.pdf


Software 2.0 [Andrej Karpathy]

https://karpathy.medium.com/software-2-0-a64152b37c35


Why Greatness Cannot Be Planned: The Myth of the Objective [Kenneth Stanley]

https://amzn.to/3Wfy2E0


Biographical account of Terence Tao’s mathematical development. [M.A.(KEN) CLEMENTS]

https://gwern.net/doc/iq/high/smpy/1984-clements.pdf


Model Evaluation and Threat Research (METR)

https://metr.org/


Why Tool AIs Want to Be Agent AIs

https://gwern.net/tool-ai


Simulators - Janus

https://www.lesswrong.com/posts/vJFdjigzmcXMhNTsx/simulators


AI Control: Improving Safety Despite Intentional Subversion

https://www.lesswrong.com/posts/d9FJHawgkiMSPjagR/ai-control-improving-safety-despite-intentional-subversion

https://arxiv.org/abs/2312.06942


What a Compute-Centric Framework Says About Takeoff Speeds

https://www.openphilanthropy.org/research/what-a-compute-centric-framework-says-about-takeoff-speeds/


Global GDP over the long run

https://ourworldindata.org/grapher/global-gdp-over-the-long-run?yScale=log


Safety Cases: How to Justify the Safety of Advanced AI Systems

https://arxiv.org/abs/2403.10462


The Danger of a “Safety Case"

http://sunnyday.mit.edu/The-Danger-of-a-Safety-Case.pdf


The Future Of Work Looks Like A UPS Truck (~02:15:50)

https://www.npr.org/sections/money/2014/05/02/308640135/episode-536-the-future-of-work-looks-like-a-ups-truck


SWE-bench

https://www.swebench.com/


Using DeepSpeed and Megatron to Train Megatron-Turing NLG

530B, A Large-Scale Generative Language Model

https://arxiv.org/pdf/2201.11990


Algorithmic Progress in Language Models

https://epochai.org/blog/algorithmic-progress-in-language-models

...more
View all episodesView all episodes
Download on the App Store

Machine Learning Street Talk (MLST)By Machine Learning Street Talk (MLST)

  • 4.7
  • 4.7
  • 4.7
  • 4.7
  • 4.7

4.7

83 ratings


More shows like Machine Learning Street Talk (MLST)

View all
Data Skeptic by Kyle Polich

Data Skeptic

474 Listeners

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) by Sam Charrington

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

429 Listeners

Super Data Science: ML & AI Podcast with Jon Krohn by Jon Krohn

Super Data Science: ML & AI Podcast with Jon Krohn

294 Listeners

NVIDIA AI Podcast by NVIDIA

NVIDIA AI Podcast

321 Listeners

Practical AI by Practical AI LLC

Practical AI

197 Listeners

Google DeepMind: The Podcast by Hannah Fry

Google DeepMind: The Podcast

190 Listeners

Last Week in AI by Skynet Today

Last Week in AI

274 Listeners

Dwarkesh Podcast by Dwarkesh Patel

Dwarkesh Podcast

324 Listeners

No Priors: Artificial Intelligence | Technology | Startups by Conviction

No Priors: Artificial Intelligence | Technology | Startups

103 Listeners

This Day in AI Podcast by Michael Sharkey, Chris Sharkey

This Day in AI Podcast

193 Listeners

Latent Space: The AI Engineer Podcast by swyx + Alessio

Latent Space: The AI Engineer Podcast

64 Listeners

"Upstream" with Erik Torenberg by Erik Torenberg

"Upstream" with Erik Torenberg

65 Listeners

The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis by Nathaniel Whittemore

The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis

421 Listeners

AI + a16z by a16z

AI + a16z

26 Listeners

Training Data by Sequoia Capital

Training Data

31 Listeners