Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI Alignment Research Engineer Accelerator (ARENA): call for applicants, published by TheMcDouglas on April 17, 2023 on LessWrong.
TL;DR
Apply here for the second iteration of ARENA!
Introduction
We are excited to announce the second iteration of ARENA (Alignment Research Engineer Accelerator), a 6-week ML bootcamp with a focus on AI safety. Our mission is to prepare participants for full-time careers as research engineers in AI safety, e.g. at leading organizations or as independent researchers.
The program will commence on May 22nd, 2023, and will be held at the Moorgate WeWork offices in London. This will overlap with SERI MATS, who are also using these offices. We expect this to bring several benefits, e.g. facilitating productive discussions about AI safety & different agendas, and allowing participants to form a better picture of what working on AI safety can look like in practice.
ARENA offers a unique opportunity for those interested in AI safety to learn valuable technical skills, engage in their own projects, and make open-source contributions to AI safety-related libraries. The program is comparable to MLAB or WMLB, but extends over a longer period to facilitate deeper dives into the content, and more open-ended project work with supervision.
For more information, see our website.
Outline of Content
The 6-week program will be structured as follows:
Chapter 0 - Fundamentals
Before getting into more advanced topics, we first cover the basics of deep learning, including basic machine learning terminology, what neural networks are, and how to train them. We will also cover some subjects we expect to be useful going forwards, e.g. using GPT-3 and 4 to streamline your learning, good coding practices, and version control.
Topics include:
PyTorch basics
CNNs, Residual Neural Networks
Optimization
Backpropagation
Hyperparameter search with Weights and Biases
Model training & PyTorch Lightning
Duration: 5 days
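To give a flavour of this chapter, below is a minimal sketch of the kind of PyTorch training loop covered in the first week. The model and dummy data are purely illustrative placeholders, not actual course material.

```python
# Minimal sketch of a PyTorch training loop, in the spirit of Chapter 0.
# The model and data here are placeholders, not actual course material.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Dummy batch standing in for e.g. MNIST images and labels
x = torch.randn(64, 28 * 28)
y = torch.randint(0, 10, (64,))

for step in range(100):
    logits = model(x)
    loss = loss_fn(logits, y)
    optimizer.zero_grad()
    loss.backward()   # backpropagation
    optimizer.step()  # gradient update
```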
Chapter 1 - Transformers & Mechanistic Interpretability
In this chapter, you will learn all about transformers, and build and train your own. You'll also learn about Mechanistic Interpretability of transformers, a field advanced by Anthropic's Transformer Circuits sequence and by open-source work from Neel Nanda.
Topics include:
GPT models (building your own GPT-2)
Training and sampling from transformers
TransformerLens
In-context Learning and Induction Heads
Indirect Object Identification
Superposition
Duration: 9 days
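As a taste of the tooling used in this chapter, here is a minimal sketch of loading GPT-2 with TransformerLens and caching its activations. The prompt and the particular attention pattern inspected are arbitrary choices for illustration; the course notebooks go into far more depth.

```python
# Minimal sketch of using TransformerLens to inspect GPT-2 activations.
# Illustrative only; the actual course notebooks go much deeper.
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")

prompt = "When Mary and John went to the store, John gave a drink to"
logits, cache = model.run_with_cache(prompt)

# Look at the attention patterns in the final layer
attn_pattern = cache["pattern", model.cfg.n_layers - 1]  # shape: [batch, head, query_pos, key_pos]
print(attn_pattern.shape)
```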
Chapter 2 - Reinforcement Learning
In this chapter, you will learn about some of the fundamentals of RL, and work with Gym and Gymnasium environments to run your own experiments.
Topics include:
Fundamentals of RL
Vanilla Policy Gradient
PPO
Deep Q-learning
RLHF
Gym & Gymnasium environments
Duration: 6 days
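For a sense of what working with these environments looks like, here is a minimal sketch of a random agent interacting with Gymnasium's CartPole environment. It is purely illustrative; in the course you would replace the random policy with agents such as PPO.

```python
# Minimal sketch of interacting with a Gymnasium environment (random agent).
# Illustrative only; the random policy is a placeholder for a learned agent.
import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)

total_reward = 0.0
done = False
while not done:
    action = env.action_space.sample()  # random policy as a placeholder
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated

print(f"Episode return: {total_reward}")
env.close()
```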
Chapter 3 - Training at Scale
There are a number of techniques that are helpful for training large-scale models efficiently. Here, you will learn more about these techniques and how to use them. The focus is on hands-on learning, rather than just a theoretical understanding.
Topics include:
GPUs
Distributed computing
Data/tensor/pipeline parallelism
Finetuning LLMs
Duration: 4 days
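As an illustration of one of these techniques, here is a minimal sketch of data parallelism with PyTorch's DistributedDataParallel. The tiny model, single training step, and launch setup are assumptions for the sake of a short example, not actual course material.

```python
# Minimal sketch of data parallelism with PyTorch's DistributedDataParallel (DDP).
# Launch with e.g. `torchrun --nproc_per_node=2 ddp_sketch.py`; illustrative only.
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, WORLD_SIZE and LOCAL_RANK for each process
    dist.init_process_group("nccl" if torch.cuda.is_available() else "gloo")
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    device = torch.device(f"cuda:{local_rank}" if torch.cuda.is_available() else "cpu")

    # Each process holds a replica of the model; DDP all-reduces gradients between them
    model = DDP(nn.Linear(10, 10).to(device))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    x = torch.randn(32, 10, device=device)  # in practice, each rank sees a different data shard
    loss = model(x).sum()
    loss.backward()     # gradients are synchronized across processes here
    optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```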
Chapter 4 - Capstone Projects
We will conclude this program with capstone projects, where participants get to dig into a topic related to the course. These projects should draw on many of the skills and much of the knowledge participants will have accumulated over the previous five weeks.
Duration: 6 days
Below is a diagram of the curriculum as a whole, and the dependencies between sections.
Here is some sample material from the course, which you will be able to fully understand once you reach that point in the course. This notebook is on Indirect Object Identification (from the chapter on Transformers & Mechanistic Interpretability); it represents one of a set of optional 2-day mi...