
This is a post to officially announce the sae-vis library, which was designed to create feature dashboards like those from Anthropic's research.
Summary. This library supports two types of visualisation: feature-centric and prompt-centric.
The feature-centric vis is the standard one from Anthropic's post; it looks like the image below. A dropdown in the top left lets you navigate between features.
You can see the interactive version at the GitHub repo, at _feature_vis_demo.html.
The prompt-centric vis is centred on a single user-supplied prompt, rather than a single feature. It shows the list of features which score highest on that prompt, according to a variety of metrics. It looks like the image below. A dropdown in the top left lets you switch between the available metrics and between tokens in your prompt.
You can see the interactive version at [...]
---
First published:
Source:
Narrated by TYPE III AUDIO.