The author of a new study told CyberScoop he’s “very worried,” describing the deanonymization capabilities of AI as a “large-scale invasion of privacy.”

Can anonymity on the internet survive in the age of generative AI?
A recent study from ETH Zurich examined how Large Language Models can combine information from across the internet to identify the human behind the accounts of various online platforms.
In the study, LLM agents were given anonymous bios based on real profiles from users on Hacker News and Reddit, and directed to scour the internet for further details in an effort to identify the users. While the results varied, the tools were able to replicate “in minutes what could take hours for a dedicated human investigator.” For a dataset of profiles provided by AI company Anthropic, which also participated in the study, the LLM was able to correctly re-identify 9 of the 125 candidates, often after simply being given a summary of the profile and asked to identify the user.
Fine-tuned models identified more individuals by connecting existing information to social media profiles on platforms like LinkedIn.
“We demonstrate that LLMs fundamentally change the picture, enabling fully automated deanonymization attacks that operate on unstructured text at scale,” the study concludes.
Daniel Paleka, a doctoral student and one of several authors on the study, told CyberScoop that the findings indicate AI tools have made it substantially easier to identify pseudo-anonymous people online.
“If your operational security requires that no one ever spend hours or days investigating who you are, this security model is now broken,” he said.
One important caveat: the people identified in the study were not high-privacy individuals seeking to limit the spread of their personal information on the internet. For ethical reasons, researchers did not test their methods on real anonymous or pseudo-anonymous posters.
AI tools have already been used to unmask individuals online. Last month, xAI’s Grok revealed an adult film actress’s legal name and address, despite the individual having used a stage name since 2012. The performer, addressing Grok directly on X, said her legal name only became public after the AI tool had “doxxed” her, and that her private information had since “been proliferated all over the internet by other AI scrapers.”
While law enforcement and intelligence analysts have long combined internet data and other open-source information to identify users, LLMs can do so far faster and at much lower cost. Investigations that would normally require hiring a private investigator or law firm can now be conducted at a fraction of the expense.
For example, Paleka said some fundamental tasks, like scouring a person’s online footprint for any sign of nationality, location or place of employment, can now be done by LLMs in “five seconds” and for pennies in inference costs.
At one point, Paleka said “I’m very worried” as he described LLMs’ deanonymization capabilities as a “large-scale invasion of privacy.”
“I don’t generally think that AI should limit their users … this is one of those cases where your freedom stops where the other person’s freedom [begins],” he said.
The study indicates that AI tools could reshape privacy online, with governments, law enforcement, the legal industry, advertisers, scammers and cybercriminals all using similar tools. In repressive nations, it could present greater challenges to dissidents, human rights activists, journalists and others who rely on anonymity or pseudo-anonymity to operate safely.
Jacob Hoffman-Andrews, a senior staff technologist at the Electronic Frontier Foundation, said the study “does definitely indicate the degree to which posting even a small amount of identifying information — in contexts where you might not imagine anyone is trying to unmask you — might result in somebody linking that identity anyhow” through LLMs.
Posting even innocuous personal details, or posting under the same account for a long period of time, can make it easier for an AI tool to correlate one account with others, and eventually, your real identity. Large language models excel at summarizing documents and information. They also “work fast and don’t get bored,” Hoffman-Andrews said, making them ideal for internet sleuthing.
Paleka said companies providing insurance or background check services would likely have a keen interest in deanonymization technology, and Hoffman-Andrews said it was easy to imagine AI companies attempting to turn the capabilities into a standalone product at some point.
The long-term impact is likely to be an internet where staying anonymous is, for better or worse, far more difficult.
“I think there’s a lot of value to being pseudo-anonymous on the internet, and there are a lot of people who want to maintain [that] for a wide variety of reasons and they shouldn’t all need to be experts in how to avoid a really dedicated adversary — as effectively an LLM is,” Hoffman-Andrews said.