The Nonlinear Library: Alignment Forum

AF - Can startups be impactful in AI safety? by Esben Kran



Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Can startups be impactful in AI safety?, published by Esben Kran on September 13, 2024 on The AI Alignment Forum.
With Lakera's strides in securing LLM APIs, Goodfire AI's path to scaling interpretability, and 20+ model evaluations startups, among much else, there's a rising number of technical startups attempting to secure the model ecosystem. Of course, they have varying levels of impact on superintelligence containment and security, and even with these companies, there's a lot of potential for aligned, ambitious, and high-impact startups within the ecosystem. This point isn't new and has been made in our previous posts and by Eric Ho (Goodfire AI CEO).
To set the stage, our belief is that these are the types of companies that will have a positive impact:
Startups with a profit incentive completely aligned with improving AI safety;
that have a deep technical background to shape AGI deployment; and
that do not try to compete with AGI labs.
Piloting AI safety startups
To understand impactful technical AI safety startups better, Apart Research joined forces with collaborators from Juniper Ventures, vectorview (alumni from the latest YC cohort), Rudolf (from the upcoming def/acc cohort), Tangentic AI, and others. We then invited researchers, engineers, and students to answer a key question: "Can we come up with ideas that scale AI safety into impactful for-profits?"
The hackathon took place during a weekend two weeks ago with a keynote by Esben Kran (co-director of Apart), along with 'HackTalks' by Rudolf Laine (def/acc) and Lukas Petersson (YC / vectorview). Individual submissions were a four-page report covering the problem statement, why the solution will work, what the key risks of the solution are, and any experiments or demonstrations of the solution the team made.
This post details the top 6 projects and excludes 2 projects that were made private by request (hopefully turning into impactful startups now!). In total, we had 101 signups and 11 final entries. Winners were decided by a linear mixed-effects (LME) model conditioned on reviewer bias. Watch the authors' lightning talks here.
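As a rough illustration of this kind of scoring, here is a minimal sketch of reviewer-bias-adjusted ranking with a linear mixed-effects model in Python (statsmodels); the data, column names, and model formula are assumptions made for the example, not the hackathon's actual pipeline.

```python
# Minimal sketch: rank projects while adjusting for reviewer bias.
# Assumed long-format scores table with illustrative column names.
import pandas as pd
import statsmodels.formula.api as smf

reviews = pd.DataFrame({
    "project":  ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "reviewer": ["r1", "r2", "r3"] * 3,
    "score":    [7.0, 8.5, 8.0, 6.0, 7.5, 7.0, 8.0, 9.5, 9.0],
})

# Fixed effect per project, random intercept per reviewer: the random
# intercept absorbs reviewers who score systematically high or low, so
# the project coefficients approximate bias-adjusted quality estimates.
model = smf.mixedlm("score ~ 0 + C(project)", reviews, groups=reviews["reviewer"])
result = model.fit()

# Rank projects by their reviewer-adjusted estimated mean score.
print(result.fe_params.sort_values(ascending=False))
```

Ranking by the project fixed effects rather than raw averages keeps one generous or harsh reviewer from dominating the ordering.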
Dark Forest: Making the web more trustworthy with third-party content verification
By Mustafa Yasir (AI for Cyber Defense Research Centre, Alan Turing Institute)
Abstract: 'DarkForest is a pioneering Human Content Verification System (HCVS) designed to safeguard the authenticity of online spaces in the face of increasing AI-generated content. By leveraging graph-based reinforcement learning and blockchain technology, DarkForest proposes a novel approach to safeguarding the authentic and humane web. We aim to become the vanguard in the arms race between AI-generated content and human-centric online spaces.'
Figure: Content verification workflow supported by graph-based RL agents deciding verifications.
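As a toy illustration of how such a workflow might reach a verification decision, the sketch below propagates verifier trust over a small content graph; it uses a simple weighted-average heuristic rather than the project's actual graph-based RL or blockchain components, and every node name, weight, and threshold is an assumption made up for the example.

```python
# Toy illustration only: decide which content counts as human-verified by
# combining the trust of the verifiers who vouched for it. Not DarkForest's
# actual method; the graph, weights, and threshold are invented for the demo.
import networkx as nx

G = nx.Graph()
# Verifier nodes carry a prior trust score; content nodes start unscored.
G.add_node("verifier_1", kind="verifier", trust=0.9)
G.add_node("verifier_2", kind="verifier", trust=0.6)
G.add_node("post_123", kind="content")
G.add_node("post_456", kind="content")
# An edge means "this verifier vouched for this content", with a confidence weight.
G.add_edge("verifier_1", "post_123", confidence=0.8)
G.add_edge("verifier_2", "post_123", confidence=0.7)
G.add_edge("verifier_2", "post_456", confidence=0.4)

VERIFY_THRESHOLD = 0.5  # assumed cutoff for marking content human-verified

for node, data in G.nodes(data=True):
    if data["kind"] != "content":
        continue
    # Trust-weighted confidence from each neighbouring verifier.
    weighted = [
        G.nodes[nbr]["trust"] * G[node][nbr]["confidence"]
        for nbr in G.neighbors(node)
    ]
    score = sum(weighted) / len(weighted) if weighted else 0.0
    verdict = "verified" if score >= VERIFY_THRESHOLD else "needs review"
    print(f"{node}: score={score:.2f} -> {verdict}")
```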
Reviewer comments:
Natalia: Well-explained problem with a clear need addressed. I love that you included the content creation process, although you don't explicitly address how you would attract content creators to use your platform over others in their process. Perhaps exploring what features of platforms drive creators to each might help you make a compelling case for using yours beyond the verification capabilities.
I would have also liked to see more details on how the verification decision is made and how accurate this is on existing datasets.
Nick: There's a lot of valuable stuff in here regarding content moderation and identity verification. I'd narrow it to one problem-solution pair (e.g., "jobs to be done") and focus more on risks around early product validation (deep interviews with a range of potential users and buyers regarding value) and go-to-market. It might also be worth checking out Musubi.
Read the full project here.
Simulation Operators: An annotation operation for alignment of robots
By Ardy Haroen (USC)
Abstrac...