Best AI papers explained

Inference for Regression with Variables Generated by AI or Machine Learning



This research investigates how using artificial intelligence (AI) or machine learning (ML) to generate variables for economic regressions can lead to biased estimates and invalid statistical inference. Researchers often treat AI-generated outputs as standard data, but the authors demonstrate that measurement error in these variables, even when they come from high-performance algorithms, shifts the center of confidence intervals away from the true parameter, making them unreliable. To address these distortions, the paper introduces two practical solutions: a bias correction that does not require ground-truth validation data, and a joint estimation framework that models the latent variables and the regression parameters simultaneously. The methods are illustrated through three applications: job posting classification, CEO time-use analysis, and central bank sentiment indexing. The study gives economists a toolkit for keeping statistical inference valid when they bring modern computational tools into empirical research.
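To make the bias mechanism concrete, here is a minimal simulation sketch. It is not the paper's method or data: the data-generating process, the 10% misclassification rate, the true slope of 2.0, and the helper `ols_slope_ci` are all assumptions chosen for illustration, and the standard errors are plain homoskedastic OLS ones. It regresses an outcome first on a true binary variable and then on a noisy "ML-generated" label for it, showing that the second confidence interval is centered in the wrong place even though the label is 90% accurate.

```python
import numpy as np

def ols_slope_ci(x, y):
    """OLS of y on a constant and x; returns (slope, lower, upper) with a 95% normal CI.

    Hypothetical helper for this illustration; uses homoskedastic standard errors.
    """
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1], beta[1] - 1.96 * se, beta[1] + 1.96 * se

rng = np.random.default_rng(0)
n = 20_000
true_slope = 2.0  # assumed value for the simulation

# Latent binary variable (e.g. whether a job posting offers remote work).
theta = rng.integers(0, 2, size=n)
y = 1.0 + true_slope * theta + rng.normal(size=n)

# "ML-generated" label: correct 90% of the time, flipped otherwise (assumed error model).
flip = rng.random(n) < 0.10
theta_hat = np.where(flip, 1 - theta, theta)

slope_true, lo_true, hi_true = ols_slope_ci(theta, y)
slope_hat, lo_hat, hi_hat = ols_slope_ci(theta_hat, y)

print(f"true variable:      slope={slope_true:.3f}, 95% CI=({lo_true:.3f}, {hi_true:.3f})")
print(f"generated variable: slope={slope_hat:.3f}, 95% CI=({lo_hat:.3f}, {hi_hat:.3f})")
```

Under these assumptions, the slope estimated from the generated label is attenuated toward zero, and because the sample is large the interval around it is narrow, so it excludes the true value. A larger sample only makes the interval tighter around the wrong number, which is why the problem calls for a correction rather than more data.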


Best AI papers explained, by Enoch H. Kang