LessWrong posts by zvi

“Gemini 3.1 Pro Aces Benchmarks, I Suppose” by Zvi



I’ve been trying to find a slot for this one for a while. I am thrilled that today had sufficiently little news that I am comfortable posting this.

Gemini 3.1 scores very well on benchmarks, but most of us had the same reaction after briefly trying it: “It's a Gemini model.”

And that was that, given our alternatives. But it's got its charms.

Consider this a nice little, highly skippable break.

The Pitch

It's a good model, sir. That's the pitch.

Sundar Pichai (CEO Google): Gemini 3.1 Pro is here. Hitting 77.1% on ARC-AGI-2, it's a step forward in core reasoning (more than 2x 3 Pro).

With a more capable baseline, it's great for super complex tasks like visualizing difficult concepts, synthesizing data into a single view, or bringing creative projects to life.

We’re shipping 3.1 Pro across our consumer and developer products to bring this underlying leap in intelligence to your everyday applications right away.

Jeff Dean also highlighted ARC-AGI-2 along with some cool animations, an urban planning sim, some heat transfer analysis and the general benchmarks.

On Your Marks

Google presents a good standard set of [...]

---

Outline:

(00:37) The Pitch

(01:31) On Your Marks

(04:34) Other People's Benchmarks

(06:54) Gemini 3 DeepThink V2

(12:33) Positive Feedback

(17:22) Negative Feedback

(19:07) Try Gemini Lite

---

First published:

March 4th, 2026

Source:

https://www.lesswrong.com/posts/82zizPyyPgaEswbxz/gemini-3-1-pro-aces-benchmarks-i-suppose

---

Narrated by TYPE III AUDIO.

