


The provided paper introduces phi-4, a 14-billion parameter language model developed by Microsoft Research. Unlike typical language models that rely primarily on organic web data, phi-4 achieves state-of-the-art performance for its size by intensely focusing on data quality and strategically integrating synthetic data throughout its entire training process.
The model's development is built upon three core pillars: (1) synthetic data generated for pretraining and midtraining, (2) careful curation and filtering of high-quality organic data, and (3) post-training refinements, including a pivotal-token variant of Direct Preference Optimization (DPO).
As a result of these data-centric innovations, phi-4 matches or outperforms much larger foundation models on reasoning-related tasks. Notably, it significantly surpasses its teacher model, GPT-4o, on highly complex benchmarks such as GPQA (graduate-level STEM Q&A) and MATH.
By Yun Wu