This paper reviews document parsing, the task of converting unstructured and semi-structured documents into structured, machine-readable data. It covers two main approaches: modular pipeline systems and end-to-end models built on large vision-language models. It examines the core components of document parsing, namely layout detection, content extraction (text, tables, and mathematical expressions), and relation integration, along with the challenges each approach faces. It also surveys key methodologies, datasets, evaluation metrics, and open-source tools, and concludes that further research and development are needed to advance the field.
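As a rough illustration of how the three stages of a modular pipeline fit together, the sketch below chains layout detection, content extraction, and relation integration over a toy page. It is not taken from the paper: the `Region` type, the function names, the stand-in extractors, and the top-to-bottom reading-order rule are all hypothetical simplifications. A real system would run a trained detector and recognizers on page images.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass
class Region:
    """One detected page region (hypothetical type, for illustration only)."""
    kind: str                          # "text", "table", or "formula"
    bbox: tuple[int, int, int, int]    # (x0, y0, x1, y1) in page pixels
    raw: str                           # stand-in for the cropped image content
    content: str = ""                  # filled in by the extraction stage


def detect_layout(page: list[Region]) -> list[Region]:
    # Placeholder: a real detector would predict regions from pixels.
    # Here the "page" already carries its regions, so we just copy the list.
    return list(page)


# Stand-in recognizers, one per region type (assumed, not real OCR models).
EXTRACTORS: dict[str, Callable[[str], str]] = {
    "text": lambda raw: raw.strip(),
    "table": lambda raw: "| " + " | ".join(raw.split(",")) + " |",
    "formula": lambda raw: f"${raw}$",
}


def extract_content(regions: list[Region]) -> list[Region]:
    # Route each region to the recognizer for its type; inputs are not mutated.
    return [replace(r, content=EXTRACTORS[r.kind](r.raw)) for r in regions]


def integrate_relations(regions: list[Region]) -> list[dict]:
    # Simplified reading order: sort by top edge, then by left edge.
    ordered = sorted(regions, key=lambda r: (r.bbox[1], r.bbox[0]))
    return [
        {"order": i, "type": r.kind, "content": r.content}
        for i, r in enumerate(ordered)
    ]


def parse_page(page: list[Region]) -> list[dict]:
    # Stage 1: layout detection, stage 2: extraction, stage 3: integration.
    return integrate_relations(extract_content(detect_layout(page)))


if __name__ == "__main__":
    page = [
        Region("formula", (0, 200, 100, 220), "E=mc^2"),
        Region("text", (0, 0, 100, 20), " Title "),
        Region("table", (0, 100, 100, 150), "a,b"),
    ]
    for item in parse_page(page):
        print(item)
```

An end-to-end model would, by contrast, replace all three functions with a single call that maps a page image directly to structured output; the staged form above makes each component independently swappable, at the cost of errors propagating from one stage to the next.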