August 25, 2023

LW - AI #26: Fine Tuning Time by Zvi

52 minutes

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI #26: Fine Tuning Time, published by Zvi on August 25, 2023 on LessWrong.

GPT-3.5 fine tuning is here. GPT-4 fine tuning is only a few months away. It is about to get a lot easier to get a powerful system that does what you want it to do, and knows what you want it to know, especially for the purposes of a business or a website.

As an experiment, I am putting in bold the sections I think are worth highlighting as unusually important or interesting compared with their versions in a typical week.

Table of Contents

Introduction.

Table of Contents.

Language Models Offer Mundane Utility. Claude-2 versus GPT-4.

Language Models Don't Offer Mundane Utility. No opinions, no agents.

Fact Check: Misleading. AI fact checker makes people more confused not less.

GPT-4 Real This Time. Fine tune GPT-3.5, soon GPT-4. Ask it if it's sure.

Fun With Image Generation. MidJourney inpainting ho. And oh no AI porn.

Deepfaketown and Botpocalypse Soon. Adversarial examples starting to emerge.

They Took Our Jobs. New York Times joins copyright lawsuits against OpenAI.

Introducing. Palisade Research will study potentially dangerous AI affordances.

In Other AI News. Who is adapting fastest to AI? An attempt to measure that.

Quiet Speculations. Jack Clark asks questions about what the future will bring.

The Quest for Sane Regulation. FTC asks OpenAI a different sort of question.

The Week in Audio. It's win win.

No One Would Be So Stupid As To. Make an AI conscious? Oh, come on.

Aligning a Smarter Than Human Intelligence is Difficult. Evidence for IDA?

People Are Worried About AI Killing Everyone. Polling numbers are very clear.

The Lighter Side. Only half there.

Language Models Offer Mundane Utility

Which model is better, Claude-2 or GPT-4?

Rowan Cheung makes the case that Claude 2 is superior. You get the 100k context window, ability to upload multiple files, data through early 2023 (versus late 2021) and faster processing time, all for free. In exchange, you give up plug-ins and it is worse at math. What Rowan does not mention is that GPT-4 has the edge in raw intelligence and general capability, and that the ability to set system instructions is also helpful. He implies he isn't even paying the $20/month for GPT-4, which strikes me as insane.

My verdict in practice is that by default I will use Claude-2. If I care about response quality I will use both and compare. When Claude-2 is clearly falling on its face, I'll go to GPT-4. On reflection, 'use both' is most often the correct strategy.

He also looks at the plugins. There are so many plugins, at least 867 of them. Which are worth using?

He recommends Zapier for automating through trigger actions, ChatWithPDF (I use Claude 2 for this), Wolfram Alpha for real-time data and math, VoxScript for YouTube video transcripts and web browsing, WebPilot which seems duplicative, Website Performance although I'm not sure why you'd use an AI for that, ScholarAI for searching papers, Shownotes to summarize podcasts (why?), ChatSpot for marketing and sales data and Expedia for vacation planning.

I just booked a trip, and went on two others recently, and it didn't occur to me to use the Expedia plug-in rather than, among other websites, Expedia (my go-to plan is Orbitz for flights and Google Maps for hotels). Next time I should remember to try it.

Study claims that salience of God increases acceptance of AI decisions. I would wait for the replication on this one. If it is true, it points out that there will be various ways for AIs to tip the scales towards us accepting their decisions, or potentially for humans to coordinate to turn against AI, that don't have much to do with any relevant considerations. Humans are rather buggy code.

Matt Shumer recommends a GPT-4 system message.

Use it to help you make engineering decisions in unfamiliar territory:

You are an e...

By The Nonlinear Fund
