Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI Safety Camp 2024, published by Linda Linsefors on November 18, 2023 on The AI Alignment Forum.
AI Safety Camp connects you with a research lead to collaborate on a project - to see where your work could help ensure future AI is safe.
Apply before December 1 to collaborate online from January to April 2024.
We value diverse backgrounds. Many roles, though definitely not all, require some knowledge of AI safety, mathematics, or machine learning.
Some skills requested by various projects:
Art, design, photography
Humanities scholarship
Communication
Marketing/PR
Legal expertise
Project management
Interpretability methods
Using LLMs
Coding
Math
Economics
Cybersecurity
Reading scientific papers
Knowledge of scientific methodologies
Thinking and working independently
Familiarity with the AI risk research landscape
Projects
To not build uncontrollable AI
Projects to restrict corporations from recklessly scaling the training and uses of ML models, given controllability limits.
1. Towards realistic ODDs for foundation model based AI offerings
2. Luddite Pro: information for the refined luddite
3. Lawyers (and coders) for restricting AI data laundering
4. Assessing the potential of congressional messaging campaigns for AIS
Everything else
Diverse other projects, including technical control of AGI in line with human values.
Mech-Interp
5. Modelling trajectories of language models
6. Towards ambitious mechanistic interpretability
7. Exploring toy models of agents
8. High-level mechanistic interpretability and activation engineering library
9. Out-of-context learning interpretability
10. Understanding search and goal representations in transformers
Evaluating and Steering Models
11. Benchmarks for stable reflectivity
12. SADDER: situational awareness datasets for detecting extreme risks
13. TinyEvals: how do language models speak coherent English?
14. Evaluating alignment evaluations
15. Pipelines for evaluating and steering LLMs towards faithful reasoning
16. Steering of LLMs through addition of activation vectors with latent ethical valence
Agent Foundations
17. High actuation spaces
18. Does sufficient optimization imply agent structure?
19. Discovering agents in raw bytestreams
20. The science algorithm
Miscellaneous Alignment Methods
21. SatisfIA - AI that satisfies without overdoing it
22. How promising is automating alignment research? (literature review)
23. Personalized fine-tuning token for AI value alignment
24. Self-other overlap @AE Studio
25. Asymmetric control in LLMs: model editing and steering that resists control for unalignment
26. Tackling key challenges in Debate
Other
27. AI-driven economic safety nets: restricting the macroeconomic disruptions of AGI deployment
28. Policy-based access to powerful models
29. Organise the next Virtual AI Safety Unconference
Please write your application with the research lead of your favorite project in mind. Research leads will directly review applications this round. We organizers will only assist when a project receives an overwhelming number of applications.
Apply now
Apply if you…
want to consider and try out roles for helping ensure that future AI functions safely;
are able to explain why and how you would contribute to one or more projects;
previously studied a topic or trained in skills that can bolster your new team's progress;
can join weekly team calls and block out 5 hours of work each week from January to April 2024.
Timeline
Applications
By 1 Dec: Apply. Fill in the questions doc and submit it through the form.
Dec 1-22: Interviews. You may receive an email inviting you to an interview from one or more of the research leads whose projects you applied to.
By 28 Dec: Final decisions. You will definitely know if you are admitted. Hopefully we can tell you sooner, but we pinky-swear we will by 28 Dec.
Program
Jan 13-14: Opening weekend. First meeting ...