


I'm making a website on AI companies' model evals for dangerous capabilities: AI Safety Claims Analysis. This is approximately the only analysis of companies' model evals, as far as I know. This site is in beta; I expect to add lots more content and improve the design in June. I'll add content on evals, but I also tentatively plan to expand from evals to evals and safeguards and safety cases (especially now that a company has said its safeguards are load-bearing for safety!).
Below is some cherry-picked bad stuff I noticed when I read the most recent model card from each company (except that I read the Claude 3.7 card rather than the Claude 4 one), excerpted/adapted from an earlier version of the site.
OpenAI: OpenAI says its models don't meaningfully uplift novices in creating biothreats. But it provides no justification for this claim, and its evals suggest that the models are more capable than human experts.
[...]
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
By LessWrong
