December 13, 2024

Abstracts: NeurIPS 2024 with Jindong Wang and Steven Euijong Whang

11 minutes

Researcher Jindong Wang and Associate Professor Steven Euijong Whang explore the NeurIPS 2024 work ERBench. ERBench leverages relational databases to create LLM benchmarks that can verify model rationale via keywords in addition to checking answer correctness.

Read the paper

Get datasets and code

Microsoft Research Podcast

By Researchers across the Microsoft research community

4.8

8080 ratings

More shows like Microsoft Research Podcast

a16z Podcast by Andreessen Horowitz

1,040 Listeners

Data Skeptic by Kyle Polich

481 Listeners

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) by Sam Charrington

441 Listeners

Super Data Science: ML & AI Podcast with Jon Krohn by Jon Krohn

298 Listeners

NVIDIA AI Podcast by NVIDIA

331 Listeners

The Future of Everything by Stanford Engineering

127 Listeners

AI Today Podcast by AI & Data Today

156 Listeners

Practical AI by Practical AI LLC

192 Listeners

Google DeepMind: The Podcast by Hannah Fry

198 Listeners

Machine Learning Street Talk (MLST) by Machine Learning Street Talk (MLST)

88 Listeners

Big Technology Podcast by Alex Kantrowitz

454 Listeners

MIT Technology Review Narrated by MIT Technology Review

259 Listeners

WorkLab by Microsoft

61 Listeners

Latent Space: The AI Engineer Podcast by swyx + Alessio

75 Listeners

The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis by Nathaniel Whittemore

491 Listeners