Measuring the Lulz of Funny Videos by the Number of Os in the LOLs

Google is currently running a Comedy Slam and letting users vote for the funniest videos. So far, so uninteresting. But one of the ways they picked the candidate videos was by semantically analyzing the LOLs in the comments, and that, unlike muhaha videos, is actually very interesting:

We needed an algorithm to rank these funny videos by comedic potential, e.g. is “Charlie bit my finger” funnier than “David after dentist”? Raw viewcount on its own is insufficient as a ranking metric since it is biased by video age and exposure. We noticed that viewers emphasize their reaction to funny videos in several ways: e.g. capitalization (LOL), elongation (loooooool), repetition (lolololol), exclamation (lolllll!!!!!), and combinations thereof. If a user uses an “loooooool” vs an “loool”, does it mean they were more amused? We designed features to quantify the degree of emphasis on words associated with amusement in viewer comments. We then trained a passive-aggressive ranking algorithm using human-annotated pairwise ground truth and a combination of text and audiovisual features. Similar to Music Slam, we used this ranker to populate candidates for human voting for our Comedy Slam.

Quantifying comedy on YouTube: why the number of o’s in your LOL matter (via /.)
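
For the curious, here is roughly what "quantifying the degree of emphasis" on a LOL could look like. This is only a minimal sketch of the four signals named in the quote (capitalization, elongation, repetition, exclamation), not Google's actual feature set: the regex, the `Emphasis` fields and the `lol_emphasis` function are all made up for illustration, and the ranking step is left out entirely.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern: a standalone "lol" variant (lol, loooool, lolololol,
# lolllll), optionally followed by a run of exclamation marks.
LOL_RE = re.compile(r"\b(l+(?:o+l+)+)\b(!*)", re.IGNORECASE)


@dataclass
class Emphasis:
    capitalized: bool  # LOL rather than lol
    elongation: int    # extra o's in the longest run: "loool" -> 2
    repetition: int    # extra "lo" cycles: "lololol" -> 2
    exclamation: int   # number of trailing "!"


def lol_emphasis(comment: str) -> list[Emphasis]:
    """Return one Emphasis record per LOL-like token found in the comment."""
    results = []
    for match in LOL_RE.finditer(comment):
        word, bangs = match.group(1), match.group(2)
        # Every match contains at least one run of o's, so max() is safe.
        o_runs = re.findall(r"o+", word.lower())
        results.append(
            Emphasis(
                capitalized=word.isupper(),
                elongation=max(len(run) for run in o_runs) - 1,
                repetition=len(o_runs) - 1,
                exclamation=len(bangs),
            )
        )
    return results


if __name__ == "__main__":
    for text in ["lol", "LOL", "loooooool", "lolololol", "lolllll!!!!!"]:
        print(text, lol_emphasis(text))
```

Note that this sketch only counts stretched o's, so the extra l's in "lolllll" go unmeasured, and words like "lollipop" are skipped thanks to the word boundaries. A real system would presumably cover more amusement words than just LOL.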