
This academic paper explores why large language models (LLMs), when fine-tuned on new facts, both generalize correctly and "hallucinate" incorrect information. The authors propose that out-of-context reasoning (OCR) is the single mechanism behind both phenomena. In experiments on five prominent LLMs, they show that OCR produces generalization when two concepts are causally related and hallucination when they are not. The paper also formalizes OCR as a synthetic factual recall task. In that setting, a one-layer transformer with a factorized architecture generalizes, because gradient descent carries an implicit bias toward solutions that minimize the nuclear norm of the combined matrices. A non-factorized model fails to generalize. The result highlights the role of matrix factorization in an LLM's ability to associate facts with implications, whether or not a causal link exists between them.
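To make the nuclear-norm idea concrete, here is a minimal toy sketch. It is not the paper's model or experiment: the 2x2 matrix, the observed entries, and the helper names `nuclear_norm_2x2` and `complete` are all assumptions made for illustration. Three entries of a matrix are "seen" and one is left free; picking the free entry that minimizes the nuclear norm (the sum of singular values) fills it in with the value that makes the matrix rank one, which is the kind of low-rank association between a fact and an implication that the summary describes.

```python
import math

def nuclear_norm_2x2(w):
    """Sum of singular values of a 2x2 matrix.

    Uses the identity (s1 + s2)^2 = ||W||_F^2 + 2 * |det W|,
    which holds for any 2x2 matrix, so no SVD routine is needed.
    """
    (a, b), (c, d) = w
    fro_sq = a * a + b * b + c * c + d * d
    det = a * d - b * c
    return math.sqrt(fro_sq + 2 * abs(det))

def complete(x):
    """Hypothetical toy matrix: three observed entries equal 1, one unseen entry x."""
    return [[1.0, 1.0], [1.0, x]]

# Grid search over candidate values for the unseen entry (illustrative only;
# the paper studies the implicit bias of gradient descent, not a grid search).
candidates = [i / 100 for i in range(-200, 301)]
best_x = min(candidates, key=lambda x: nuclear_norm_2x2(complete(x)))
best_norm = nuclear_norm_2x2(complete(best_x))

print(best_x, best_norm)  # 1.0 2.0
```

In this toy case the minimum-nuclear-norm completion sets the unseen entry to 1, the same value as the observed entries, regardless of whether that filled-in association is actually true. That mirrors the summary's point that the same bias can yield either a correct generalization or a hallucination.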