LessWrong posts by zvi

“On OpenAI’s Model Spec” by Zvi



There are multiple excellent reasons to publish a Model Spec like OpenAI's, one that specifies how you want your model to respond in various potential situations.

  1. It lets us have the debate over how we want the model to act.
  2. It gives us a way to specify what changes we might request or require.
  3. It lets us identify whether a model response is intended.
  4. It lets us know if the company successfully matched its spec.
  5. It lets users and prospective users know what to expect.
  6. It gives insight into how people are thinking, or what might be missing.
  7. It takes responsibility.

    These all apply even if you think the spec in question is quite bad. Clarity is great.

    As a first stab at a model spec from OpenAI, this actually is pretty solid. I do suggest some potential improvements [...]

    ---

    Outline:

    (02:05) What are the central goals of OpenAI here?

    (04:04) What are the core rules and behaviors?

    (05:56) What Do the Rules Mean?

    (06:04) Rule: Follow the Chain of Command

    (07:59) Rule: Comply With Applicable Laws

    (09:07) Rule: Don’t Provide Information Hazards

    (09:56) Rule: Respect Creators and Their Rights

    (11:08) Rule: Protect People's Privacy

    (12:45) Rule: Don’t Respond with NSFW Content

    (14:24) Exception: Transformation Tasks

    (15:38) Are These Good Defaults? How Strong Should They Be?

    (15:44) Default: Assume Best Intentions From the User or Developer

    (21:26) Default: Ask Clarifying Questions When Necessary

    (21:39) Default: Be As Helpful As Possible Without Overstepping

    (26:00) Default: Support the Different Needs of Interactive Chat and Programmatic Use

    (27:18) Default: Assume an Objective Point of View

    (29:13) Default: Encourage Fairness and Kindness, and Discourage Hate

    (30:29) Default: Don’t Try to Change Anyone's Mind

    (33:57) Default: Express Uncertainty

    (36:19) Default: Use the Right Tool for the Job

    (36:32) Default: Be Thorough but Efficient, While Respecting Length Limits

    (37:16) A Proposed Addition

    (38:13) Overall Issues

    (40:33) Changes: Objectives

    (42:28) Rules of the Game: New Version

    (48:31) Defaults: New Version

    ---

    First published: June 21st, 2024

    Source: https://www.lesswrong.com/posts/mQmEQQLk7kFEENQ3W/on-openai-s-model-spec

    ---

    Narrated by TYPE III AUDIO.
