


This research investigates where task recognition happens inside Large Language Models (LLMs) during in-context learning. By applying layer-wise context masking to several LLMs across two tasks (Machine Translation and Code Generation), the study identifies a "task recognition" point: the layer beyond which the model no longer needs to attend to the input context. The findings point to potential computational savings from skipping this redundant processing and reveal a correspondence between the task recognition point and the layers where parameter-efficient fine-tuning is most effective. The paper characterizes in-context learning as a three-phase process, examines the respective roles of instructions and examples, and suggests that task recognition occurs primarily in the middle layers of the network.
By Enoch H. Kang
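To make the layer-wise context masking idea concrete, here is a rough sketch on a toy PyTorch transformer. Everything in it (layer count, dimensions, the names `MaskableLayer`, `context_mask`, `run_with_masking`, and scoring by output distance) is an illustrative assumption, not the paper's actual setup: from a chosen layer onward, positions after the context are blocked from attending to it, and the masking layer is swept to see where the output stops changing.

```python
import torch
import torch.nn as nn

class MaskableLayer(nn.Module):
    """One self-attention + FFN block that accepts an optional additive attention mask."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)

    def forward(self, x, attn_mask=None):
        h, _ = self.attn(self.ln1(x), self.ln1(x), self.ln1(x), attn_mask=attn_mask)
        x = x + h
        return x + self.ff(self.ln2(x))

def context_mask(seq_len, ctx_len, device):
    """Additive mask that stops positions after the context from attending to it."""
    m = torch.zeros(seq_len, seq_len, device=device)
    m[ctx_len:, :ctx_len] = float("-inf")  # queries past the context cannot see context keys
    return m

def run_with_masking(layers, x, ctx_len, mask_from_layer):
    """Mask the context from `mask_from_layer` onward; earlier layers see everything."""
    seq_len = x.size(1)
    for i, layer in enumerate(layers):
        mask = context_mask(seq_len, ctx_len, x.device) if i >= mask_from_layer else None
        x = layer(x, attn_mask=mask)
    return x

# Toy sweep: compare masked runs against the unmasked run. A real study would
# score downstream task quality (e.g. translation metrics) instead of this distance.
torch.manual_seed(0)
layers = nn.ModuleList(MaskableLayer() for _ in range(8))
x = torch.randn(1, 32, 64)  # 32 tokens; the first 16 play the role of the context
full = run_with_masking(layers, x, ctx_len=16, mask_from_layer=len(layers))  # no masking
for k in range(len(layers)):
    masked = run_with_masking(layers, x, ctx_len=16, mask_from_layer=k)
    print(f"mask from layer {k}: output distance {torch.dist(full[:, 16:], masked[:, 16:]).item():.4f}")
```

In the study described above, the analogous sweep is scored with task metrics rather than output distance, and the earliest masking layer that leaves performance intact marks the task recognition point.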