Learning GenAI via SOTA Papers

EP047: Bootstrapping AI With Self-Generated Instructions



SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions introduces a novel framework for improving the instruction-following capabilities of pretrained language models using minimal human-labeled data.

Large language models typically depend heavily on human-written instruction datasets to learn how to follow prompts zero-shot. However, creating this human-annotated data is costly and often lacks the diversity and creativity needed to cover a wide variety of tasks, which bottlenecks the model's ability to generalize.

To solve this, the authors propose SELF-INSTRUCT, a semi-automated pipeline that bootstraps instruction data directly from the language model itself. The process begins with a small seed pool of 175 human-written tasks and uses the model to iteratively execute four steps:

  1. Instruction Generation: The model generates new task instructions based on a sample of existing ones.
  2. Classification Task Identification: The model determines whether the new instruction describes a classification task.
  3. Instance Generation: The model generates input-output instances for the task using either an input-first approach (for non-classification tasks) or an output-first approach (to prevent biased labels in classification tasks).
  4. Filtering: Heuristics are used to filter out invalid, low-quality, or highly repetitive instructions before adding the successful tasks back into the pool.
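The four-step loop above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: `generate_instructions` and `generate_instances` stand in for the prompted LLM calls, and the similarity filter is a simple word-level ROUGE-L stand-in for the paper's near-duplicate heuristic.

```python
import random

def rouge_l_similarity(a: str, b: str) -> float:
    # Longest-common-subsequence F-measure over word tokens,
    # a minimal stand-in for a ROUGE-L duplicate check.
    xs, ys = a.split(), b.split()
    dp = [[0] * (len(ys) + 1) for _ in range(len(xs) + 1)]
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[-1][-1]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(xs), lcs / len(ys)
    return 2 * precision * recall / (precision + recall)

def self_instruct(seed_tasks, generate_instructions, generate_instances,
                  iterations=1, sim_threshold=0.7):
    """Bootstrap an instruction pool from a small seed set.

    `generate_instructions(sample)` and `generate_instances(instruction)`
    are hypothetical wrappers around prompted LLM calls.
    """
    pool = list(seed_tasks)
    for _ in range(iterations):
        # Step 1: prompt the model with a sample of existing tasks.
        sample = random.sample(pool, min(8, len(pool)))
        for new_instruction in generate_instructions(sample):
            # Step 4 (filtering): drop near-duplicates of the pool.
            if any(rouge_l_similarity(new_instruction, old) > sim_threshold
                   for old in pool):
                continue
            # Steps 2-3: instance generation (input-first vs. output-first
            # branching by task type would happen inside this call).
            instances = generate_instances(new_instruction)
            if instances:  # keep only tasks that yielded valid instances
                pool.append(new_instruction)
    return pool
```

Because newly accepted tasks go back into the pool, each iteration samples from an ever-growing and more diverse set, which is what lets 175 seed tasks bootstrap tens of thousands of instructions.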

Key Results: By applying this pipeline to a vanilla GPT-3 model, the researchers generated a diverse synthetic dataset of over 52,000 instructions and 82,000 instances. When GPT-3 was finetuned on this self-generated data (creating a model called GPT3_SELF-INST), its zero-shot performance on the SUPER-NATURALINSTRUCTIONS benchmark improved by 33% over the original model. Furthermore, human evaluations on a newly curated set of 252 complex, user-oriented tasks showed that GPT3_SELF-INST outperformed models trained on other public instruction datasets and performed nearly on par with InstructGPT_001, which relies on private user data and expensive human annotations.


By Yun Wu