February 24, 2026

“Large-Scale Online Deanonymization with LLMs” by Simon Lermen, Daniel Paleka

8 minutes

TL;DR: We show that LLM agents can figure out who you are from your anonymous online posts. Across Hacker News, Reddit, LinkedIn, and anonymized interview transcripts, our method identifies users with high precision – and scales to tens of thousands of candidates.

It has long been known that individuals can be uniquely identified by surprisingly few attributes, but exploiting this was often impractical: data is usually available only in unstructured form, and deanonymization required human investigators to search and reason from clues. We show that from a handful of comments, LLMs can infer where you live, what you do, and your interests – then search for you on the web. Our new research shows that this is not only possible but increasingly practical.
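To make the "surprisingly few attributes" point concrete, here is a toy sketch of how each extra clue shrinks a candidate pool. It is not the paper's method: the candidate records, attribute names, and the `narrow` helper are all invented for illustration, and the clues are hard-coded rather than inferred by an LLM from unstructured text.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate pool; every record is invented for illustration.
CANDIDATES: List[Dict[str, str]] = [
    {"name": "A", "city": "Zurich", "job": "engineer", "hobby": "climbing"},
    {"name": "B", "city": "Zurich", "job": "engineer", "hobby": "chess"},
    {"name": "C", "city": "Zurich", "job": "teacher", "hobby": "climbing"},
    {"name": "D", "city": "Berlin", "job": "engineer", "hobby": "climbing"},
    {"name": "E", "city": "Berlin", "job": "teacher", "hobby": "chess"},
]


def narrow(
    candidates: List[Dict[str, str]], clues: Dict[str, str]
) -> Tuple[List[int], List[Dict[str, str]]]:
    """Apply clues one at a time; return pool sizes per step and the survivors."""
    pool = list(candidates)
    sizes = [len(pool)]
    for key, value in clues.items():
        # Keep only candidates whose attribute matches the clue exactly.
        pool = [c for c in pool if c.get(key) == value]
        sizes.append(len(pool))
    return sizes, pool


# Assumed clues, standing in for attributes inferred from a few comments.
clues = {"city": "Zurich", "job": "engineer", "hobby": "climbing"}
sizes, survivors = narrow(CANDIDATES, clues)
print(sizes)                           # pool size after each clue
print([c["name"] for c in survivors])  # remaining candidates
```

In this toy pool, three clues are enough to leave a single candidate. Real attributes are noisy and rarely match exactly, so a practical system would need scoring rather than hard filters.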

Paper: Large-Scale Online Deanonymization with LLMs

Motivation – Why do we research this?

Among the near-term effects of AI, forms of AI surveillance pose some of the most concrete harms. It is already known that LLMs can infer personal attributes about authors and use them to build biographical profiles of individuals. Such profiles can be misused straightforwardly for spear-phishing or many other monetizable exploits. Using AI [...]

---

Outline:

(01:16) Motivation – Why do we research this?

(02:13) How We Designed Our Benchmarks

(02:48) Proxy 1: Cross-Platform Matching

(03:53) Proxy 2: Matching Split Accounts

(04:39) It Scales to Much Larger Datasets

(05:17) Identifying people in the real world

(06:03) What Now?

---

First published:

February 24th, 2026