Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI #30: Dalle-3 and GPT-3.5-Instruct-Turbo, published by Zvi on September 21, 2023 on LessWrong.
We are about to see what looks like a substantial leap in image models. OpenAI will be integrating Dalle-3 into ChatGPT. The pictures we've seen look gorgeous and richly detailed, and the model can generate pictures to much more complex specifications than existing image models. Before, the rule of thumb was that you could get one of each magisterium, but good luck getting two things you want from a given magisterium. Now, perhaps, you can, if you are willing to give up on adult content and images of public figures, since OpenAI is (quite understandably) no fun.
We will find out in a few weeks, as it rolls out to ChatGPT+ users.
As usual, a bunch of other stuff also happened, including a model danger classification system from Anthropic, OpenAI announcing an outside red teaming squad, a study of AI's impact on consultant job performance, some incremental upgrades to Bard including an extension for Gmail, new abilities to diagnose medical conditions, and some rhetorical innovations.
Also, don't look now, but GPT-3.5-Turbo-Instruct plays chess at 1800 Elo, and thanks to its relative lack of destructive RLHF it offers relatively strong performance at very low cost and very high speed, although for most purposes its final output quality is still substantially behind GPT-4.
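For the curious, here is a minimal sketch of how that chess play is typically elicited: prompt the raw completions endpoint with a partial PGN transcript and read off the next move. This assumes the pre-1.0 openai Python SDK and an API key in the environment; the helper name and exact PGN framing are illustrative, not the precise setup of the cited experiment.

```python
# Minimal sketch: gpt-3.5-turbo-instruct is a completions model, so you hand
# it a partial PGN transcript and let it complete the next move.
# Assumes the pre-1.0 openai Python SDK and OPENAI_API_KEY in the environment.
import openai

def next_move(pgn_moves: str) -> str:
    """Given a PGN move list like '1. e4 e5 2. Nf3', return the model's next move."""
    # PGN-style headers cue the model that this is a game transcript.
    prompt = (
        '[Event "Casual Game"]\n'
        '[Result "*"]\n\n'
        + pgn_moves.strip()
    )
    response = openai.Completion.create(
        model="gpt-3.5-turbo-instruct",
        prompt=prompt,
        max_tokens=6,     # a single SAN move is only a few tokens
        temperature=0,    # play deterministically
        stop=["\n"],
    )
    # Take the first whitespace-delimited token of the completion as the move.
    return response.choices[0].text.strip().split()[0]

print(next_move("1. e4 e5 2. Nf3"))  # e.g. 'Nc6'
```

The notable part is that this works at all: no chat wrapper, no RLHF persona, just next-token prediction over game transcripts.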
Table of Contents
Introduction.
Table of Contents.
Language Models Offer Mundane Utility. GPT-4 boosts consultant productivity.
Language Models Don't Offer Mundane Utility. Do we want to boost that?
Level Two Bard. Some improvements, I suppose. Still needs a lot of work.
Wouldn't You Prefer a Good Game of Chess? An LLM at 1800 Elo. World model.
GPT-4 Real This Time. GPT-3.5-Instruct-Turbo proves its practical use, perhaps.
Fun With Image Generation. Introducing Dalle-3.
Deepfaketown and Botpocalypse Soon. Amazon limits self-publishing to 3 a day.
Get Involved. OpenAI hiring for mundane safety, beware the double-edged sword.
Introducing. OpenAI red team network, Anthropic responsible scaling policy.
In Other AI News. UK government and AI CEO both change their minds.
Technical Details. One grok for grammar, another for understanding.
Quiet Speculations. Michael Nielsen offers extended thoughts on extinction risk.
The Quest for Sane Regulation. Everyone is joining the debate, it seems.
The Week in Audio. A lecture about copyright law.
Rhetorical Innovation. We keep trying.
No One Would Be So Stupid As To. Are we asking you to stop?
Aligning a Smarter Than Human Intelligence is Difficult. Asimov's laws? No.
I Didn't Do It, No One Saw Me Do It, You Can't Prove Anything. Can you?
People Are Worried About AI Killing Everyone. Yet another round of exactly how.
Other People Are Not As Worried About AI Killing Everyone. Tony Blair.
The Lighter Side. Jesus flip the tables.
Language Models Offer Mundane Utility
Diagnose eye diseases. This seems like a very safe application even with false positives, since humans can verify anything the AI finds.
Diagnose fetal growth restrictions early.
Use the 'reading mode' in Android or Chrome, built (in theory and technically) using graph neural networks, to strip the words out of a webpage and present them in an actually readable size and font, much more accurately than older attempts. It seems you have to turn it on under chrome://flags.
GPT-4 showing some solid theory of mind in a relatively easy situation. Always notice whether you are finding out it can do X consistently, can do X typically, or can do X once with bespoke prompting.
The same goes for failure to do X. What does it mean that a model would ever say ~X, versus that it says ~X much of the time, versus that it says ~X every time? Each is different.
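A minimal sketch of how one might operationalize that distinction, again assuming the pre-1.0 openai Python SDK; the grading function `passes` is a hypothetical stand-in for whatever check fits the task:

```python
# Sketch: distinguish "can do X once" from "does X typically" from "does X
# every time" by sampling the same prompt k times and grading each answer.
# `passes` is a hypothetical grader supplied by the evaluator.
import openai

def success_rate(prompt: str, passes, k: int = 20) -> float:
    """Fraction of k sampled completions that pass the grading check."""
    hits = 0
    for _ in range(k):
        response = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
            temperature=1.0,  # sample, rather than taking the single argmax answer
        )
        if passes(response.choices[0].message["content"]):
            hits += 1
    return hits / k

# Roughly: 1.0 means "every time", 0.5 means "typically",
# and anything above 0 means "at least once".
```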
How to convince people who are unimpressed by code writing that LLMs are not simply parrots? Eliezer asked on Twitter, and said ...