LessWrong posts by zvi

“Gemini 3.1 Pro Aces Benchmarks, I Suppose” by Zvi



I’ve been trying to find a slot for this one for a while. I am thrilled that today had sufficiently little news that I am comfortable posting this.

Gemini 3.1 scores very well on benchmarks, but most of us had the same reaction after briefly trying it: “It's a Gemini model.”

And that was that, given our alternatives. But it's got its charms.

Consider this a nice little, highly skippable break.

The Pitch

It's a good model, sir. That's the pitch.

Sundar Pichai (CEO Google): Gemini 3.1 Pro is here. Hitting 77.1% on ARC-AGI-2, it's a step forward in core reasoning (more than 2x 3 Pro).

With a more capable baseline, it's great for super complex tasks like visualizing difficult concepts, synthesizing data into a single view, or bringing creative projects to life.

We’re shipping 3.1 Pro across our consumer and developer products to bring this underlying leap in intelligence to your everyday applications right away.

Jeff Dean also highlighted ARC-AGI-2 along with some cool animations, an urban planning sim, some heat transfer analysis and the general benchmarks.

On Your Marks

Google presents a good standard set of [...]

---

Outline:

(00:37) The Pitch

(01:31) On Your Marks

(04:34) Other People's Benchmarks

(06:54) Gemini 3 DeepThink V2

(12:33) Positive Feedback

(17:22) Negative Feedback

(19:07) Try Gemini Lite

---

First published:

March 4th, 2026

Source:

https://www.lesswrong.com/posts/82zizPyyPgaEswbxz/gemini-3-1-pro-aces-benchmarks-i-suppose

---

Narrated by TYPE III AUDIO.

