Language Identifies Users of Online Underground Markets

Aylin Caliskan Islam and Sadia Afroz gave a fascinating talk at 29C3 titled Stylometry and Online Underground Markets, about identifying anonymous users by their linguistic "fingerprint", with one caveat: "Leetspeak, an alternative alphabet popular in some forum circles, cannot be translated". As an interested linguistics n00b I found the whole thing a bit over my head, but Security Business Mag has broken the talk down for ordinary nerds:

Up to 80 percent of certain anonymous underground forum users can be identified using linguistics, researchers say. The techniques compare user posts to track them across forums and could even unveil authors of thesis papers or blogs who had taken to underground networks. "If our dataset contains 100 users we can at least identify 80 of them," researcher Sadia Afroz told an audience at the 29C3 Chaos Communication Congress in Germany.

"Function words are very specific to the writer. Even if you are writing a thesis, you'll probably use the same function words in chat messages. Even if your text is not clean, your writing style can give you away." The analysis techniques could also reveal botnet owners, malware tool authors and provide insight into the size and scope of underground markets, making the research appealing to law enforcement.
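
To get a feel for what "function words give you away" could mean in practice, here is a minimal Python sketch. It is not the researchers' method: the tiny word list, the use of cosine similarity, and the names `FUNCTION_WORDS`, `profile`, `cosine` and `closest_author` are all my own assumptions for illustration. The idea is simply to count how often each function word appears in a text and to match an unknown text against the known author whose counts look most similar.

```python
import math
import re
from collections import Counter

# Tiny illustrative list of English function words.
# Assumption: real stylometry systems use far larger feature sets.
FUNCTION_WORDS = [
    "the", "a", "an", "and", "but", "or", "if", "of", "to", "in",
    "on", "for", "with", "that", "this", "it", "you", "i", "is", "are", "not",
]


def profile(text):
    """Return the relative frequency of each function word in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def closest_author(unknown_text, known_texts):
    """Return the key in `known_texts` whose function-word profile best matches."""
    target = profile(unknown_text)
    return max(
        known_texts,
        key=lambda name: cosine(target, profile(known_texts[name])),
    )
```

Relative frequencies are used instead of raw counts so that a long thesis and a short chat message can be compared at all. With only a sentence or two of text such a profile is very noisy, so treat this as a toy and not as something that would reach the accuracy quoted above.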

Linguistics identifies anonymous users: Researchers reveal carders, hackers on underground forums (via /.)