Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: SolidGoldMagikarp (plus, prompt generation), published by Jessica Rumbelow on February 5, 2023 on LessWrong.
Work done at SERI-MATS, over the past two months, with Matthew Watkins.
TL;DR:
Anomalous Tokens: a mysterious failure mode for GPT (which reliably insulted my colleague Matthew)
We have found a set of anomalous tokens which result in a previously undocumented failure mode for GPT-2 and GPT-3 models. (The instruct models are particularly deranged.)
It also appears to break determinism in the playground at temperature 0, which shouldn't happen.
Prompt Generation: a new interpretability method for language models (which reliably finds prompts that result in a target completion).
Good for eliciting knowledge
Generating adversarial inputs
Automating prompt search (e.g. for fine-tuning)
In this post, we'll introduce the prototype of a new model-agnostic interpretability method for language models which reliably generates adversarial prompts that result in a target completion. We'll also demonstrate a previously undocumented failure mode for GPT-2 and GPT-3 language models, which results in bizarre completions (in some cases explicitly counter to the purpose of the model), and present the results of our investigation into this phenomenon.
First up, prompt generation. An easy intuition for this is to think about feature visualisation for image classifiers (there's an excellent explanation here, if you're unfamiliar with the concept).
We can study how a neural network represents concepts by taking some random input and using gradient descent to tweak it until it maximises a particular activation. The image above shows the resulting inputs that maximise the output logits for the classes 'goldfish', 'monarch', 'tarantula' and 'flamingo'. This is pretty cool! We can see what VGG thinks is the most 'goldfish'-y thing in the world, and it's got scales and fins. Note though, that it isn't a picture of a single goldfish. We're not seeing the kind of input that VGG was trained on. We're seeing what VGG has learned. This is handy: if you wanted to sanity check your goldfish detector, and the feature visualisation showed just water, you'd know that the model hadn't actually learned to detect goldfish, but rather the environments in which they typically appear. So it would label every image containing water as 'goldfish', which is probably not what you want. Time to go get some more training data.
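To make that concrete, here's a minimal sketch of the optimisation loop in PyTorch, using a pretrained torchvision VGG. The class index, step count and penalty weight are illustrative choices rather than anything from the post, and real feature visualisation work typically adds image transformations and heavier regularisation to get the clean pictures shown above.

```python
# Minimal feature-visualisation sketch (illustrative hyperparameters).
import torch
import torchvision.models as models

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
for p in model.parameters():
    p.requires_grad_(False)  # only the input image gets optimised

target_class = 1  # ImageNet index for 'goldfish'
image = torch.randn(1, 3, 224, 224, requires_grad=True)  # start from noise
optimizer = torch.optim.Adam([image], lr=0.05)

for step in range(256):
    optimizer.zero_grad()
    logits = model(image)
    # Maximise the target logit (minimise its negative), with a small
    # L2 penalty so the input doesn't blow up.
    loss = -logits[0, target_class] + 1e-4 * image.norm()
    loss.backward()
    optimizer.step()
```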
So, how can we apply this approach to language models?
Some interesting stuff here. Note that as with image models, we're not optimising for realistic inputs, but rather for inputs that maximise the output probability of the target completion, shown in bold above.
So now we can do stuff like this:
And this:
I'll leave it to you to lament the state of the internet that results in the above optimised inputs for the token ' girl'.
How do we do this? It's tricky, because unlike pixels, the inputs to LLMs are discrete tokens. This is not conducive to gradient descent. However, these discrete tokens are mapped to embeddings, which do occupy a continuous, albeit sparse, space. (Most of this space doesn't correspond to actual tokens – there is a lot of space between tokens in embedding space, and we don't want to find a solution there.) However, with a combination of regularisation and explicit coercion to keep embeddings close to the realm of legal tokens during optimisation, we can make it work. Code available here if you want more detail.
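To give a flavour of what this looks like in practice, here's a rough sketch using PyTorch and Hugging Face transformers. It is not the authors' implementation (that's in the linked code): the prompt length, learning rate and penalty weight are illustrative, and a simple distance-to-nearest-token penalty stands in for the combination of regularisation and coercion described above.

```python
# Sketch: optimise "soft" prompt embeddings so GPT-2 assigns high probability
# to a target completion, while staying close to legal token embeddings.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
for p in model.parameters():
    p.requires_grad_(False)

embed_matrix = model.transformer.wte.weight              # (vocab_size, d_model)
target_ids = tokenizer.encode(" goldfish", return_tensors="pt")
target_embeds = model.transformer.wte(target_ids)

prompt_len = 5
# Initialise the soft prompt from random token embeddings.
init_ids = torch.randint(0, embed_matrix.shape[0], (1, prompt_len))
soft_prompt = embed_matrix[init_ids].clone().detach().requires_grad_(True)
optimizer = torch.optim.Adam([soft_prompt], lr=0.1)

for step in range(200):
    optimizer.zero_grad()
    inputs_embeds = torch.cat([soft_prompt, target_embeds], dim=1)
    logits = model(inputs_embeds=inputs_embeds).logits
    # Cross-entropy of the target tokens, predicted from the positions before them.
    pred = logits[:, prompt_len - 1:-1, :]
    ce = torch.nn.functional.cross_entropy(
        pred.reshape(-1, pred.size(-1)), target_ids.reshape(-1))
    # Penalise distance from the nearest legal token embedding.
    dists = torch.cdist(soft_prompt[0], embed_matrix)     # (prompt_len, vocab)
    reg = dists.min(dim=-1).values.mean()
    (ce + 0.1 * reg).backward()
    optimizer.step()

# Snap each optimised embedding to its nearest token to read off a discrete prompt.
prompt_ids = torch.cdist(soft_prompt[0].detach(), embed_matrix).argmin(dim=-1)
print(tokenizer.decode(prompt_ids))
```

The final snapping step is what turns the optimised points in embedding space back into a discrete prompt that can actually be fed to the model, which is why the penalty keeping embeddings near legal tokens matters.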
Prompt generation is only possible because token embedding space is semantically meaningful: related tokens are close together. We discovered this by running k-means over the embedding space of the GPT vocabulary, and found many clusters that are surprisingly robust to random initialisation of the centroids. Here are a few examples.
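For illustration, something like the following reproduces the clustering step with scikit-learn; the number of clusters and the tokens printed per cluster are arbitrary choices here, not necessarily the post's exact setup.

```python
# Sketch: k-means over the GPT-2 token embedding matrix, then inspect one cluster.
import torch
from sklearn.cluster import KMeans
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

embeddings = model.transformer.wte.weight.detach().numpy()   # (50257, 768)
kmeans = KMeans(n_clusters=100, n_init=10, random_state=0).fit(embeddings)

# Show the tokens closest to one cluster's centroid.
cluster = 0
members = (kmeans.labels_ == cluster).nonzero()[0]
centroid = kmeans.cluster_centers_[cluster]
dists = ((embeddings[members] - centroid) ** 2).sum(axis=1)
closest = members[dists.argsort()[:10]]
print([tokenizer.decode([int(i)]) for i in closest])
```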
During this process we found some weir...