SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions introduces a novel framework for improving the instruction-following capabilities of pretrained language models using minimal human-labeled data.
Large language models typically depend heavily on human-written instruction datasets to learn how to follow prompts zero-shot. However, creating this human-annotated data is costly and often lacks the diversity and creativity needed to cover a wide variety of tasks, which bottlenecks the model's ability to generalize.
To solve this, the authors propose SELF-INSTRUCT, a semi-automated pipeline that bootstraps instruction data directly from the language model itself. The process begins with a small seed pool of 175 human-written tasks and uses the model to iteratively execute four steps:

1. Instruction generation: the model proposes new task instructions, prompted with examples sampled from the pool.
2. Classification task identification: the model decides whether each new instruction describes a classification task, since those are instantiated differently.
3. Instance generation: the model produces input-output instances for each instruction.
4. Filtering and postprocessing: low-quality and near-duplicate instructions (those too similar to existing pool entries under a ROUGE-L check) are discarded, and the remainder are added back to the pool for future rounds.
Key Results: By applying this pipeline to a vanilla GPT-3 model, the researchers generated a diverse synthetic dataset of over 52,000 instructions and 82,000 instances. When GPT-3 was fine-tuned on this self-generated data (creating a model called GPT3SELF-INST), its zero-shot performance on the SUPER-NATURALINSTRUCTIONS benchmark improved by an absolute 33% over the original model. Furthermore, human evaluations on a newly curated set of 252 complex, user-oriented tasks showed that GPT3SELF-INST outperformed models trained on other public instruction datasets and performed nearly on par with InstructGPT001, which relies on private user data and expensive human annotations.
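The bootstrapping loop described above can be sketched in a few lines of Python. This is only an illustration: `toy_model_generate` is a hypothetical stub standing in for a prompted GPT-3 call, `word_overlap` is a crude stand-in for the ROUGE-L similarity filter the authors actually use, and the classification-identification and instance-generation steps are omitted for brevity.

```python
import random

# Canned outputs standing in for what a prompted LM might generate.
CANNED_OUTPUTS = [
    "Summarize the paragraph in one sentence",
    "Classify the review as positive or negative",
    "Write a haiku about the given topic",
]

def toy_model_generate(prompt_tasks):
    """Hypothetical stub for the LM: ignores the prompt, emits canned instructions."""
    return random.sample(CANNED_OUTPUTS, 2)

def word_overlap(a, b):
    """Crude word-overlap proxy for the ROUGE-L similarity filter in the paper."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def self_instruct(seed_tasks, rounds=3, in_context=2, max_sim=0.7):
    pool = list(seed_tasks)
    for _ in range(rounds):
        # Step 1: sample in-context examples and ask the model for new instructions.
        prompt = random.sample(pool, min(in_context, len(pool)))
        candidates = toy_model_generate(prompt)
        # Step 4 (simplified): keep only candidates sufficiently novel vs. the pool.
        for cand in candidates:
            if all(word_overlap(cand, task) < max_sim for task in pool):
                pool.append(cand)
    return pool

seed = ["Translate the sentence to French", "List three uses for a paperclip"]
grown = self_instruct(seed)
print(f"Pool grew from {len(seed)} to {len(grown)} tasks")
```

Because a task's overlap with itself is 1.0, the novelty filter also deduplicates the pool, mirroring how the real pipeline prevents the dataset from collapsing onto near-copies of the seeds.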
By Yun Wu