
This academic paper presents Missing At Random Structured data (MARS), a framework for valid statistical inference using predictions from black box AI models, particularly neural networks. It addresses the issue of bias propagation when these models are used to impute missing structured data from unstructured sources. MARS achieves robust and efficient estimation by requiring access to a ground truth annotation sample and assuming data are missing at random conditional on observables, linking the problem to classic statistical concepts like causal inference and missing data analysis. The paper demonstrates the applicability of MARS to common economic analyses and discusses practical considerations like aggregating data and selecting appropriate imputation functions.
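The summary does not spell out an estimator, so the following is only a rough illustration of the general idea, not the paper's method. It assumes a simple bias-corrected mean: average the model's predictions over every record, then add an inverse-probability-weighted correction from the annotated subsample. The function name `bias_corrected_mean`, the simulated data, and the annotation probabilities `pi` (assumed known and dependent only on an observable `x`) are all hypothetical.

```python
import numpy as np


def bias_corrected_mean(y_hat, y_true, annotated, pi):
    """Mean of y from predictions plus an annotated-sample correction.

    y_hat     : model predictions, available for every record
    y_true    : ground truth, only used where `annotated` is True
    annotated : boolean mask marking records with a ground-truth label
    pi        : probability each record was annotated (assumed known)
    """
    y_hat = np.asarray(y_hat, dtype=float)
    annotated = np.asarray(annotated, dtype=bool)
    pi = np.asarray(pi, dtype=float)
    # Prediction error, set to zero wherever no annotation exists.
    resid = np.where(annotated, np.asarray(y_true, dtype=float) - y_hat, 0.0)
    # Reweight annotated errors so they stand in for the full sample.
    scores = y_hat + resid / pi
    est = scores.mean()
    se = scores.std(ddof=1) / np.sqrt(len(scores))
    return est, se


# Simulated example (all numbers are made up for illustration).
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)                            # observable covariate
y = (x + rng.normal(size=n) > 0).astype(float)    # true outcome, mean near 0.5
y_hat = np.clip(0.6 + 0.2 * x, 0.0, 1.0)          # predictions biased upward
pi = np.where(x > 0, 0.2, 0.05)                   # annotation depends only on x
annotated = rng.random(n) < pi
y_obs = np.where(annotated, y, np.nan)            # truth hidden unless annotated

naive = y_hat.mean()                              # ignores prediction bias
est, se = bias_corrected_mean(y_hat, y_obs, annotated, pi)
print(f"naive={naive:.3f}  corrected={est:.3f}  (se={se:.3f})  truth={y.mean():.3f}")
```

In this sketch the naive average inherits the model's bias, while the corrected estimate stays centered on the truth as long as annotation depends only on observables, which is the missing-at-random condition the summary describes.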