A.I. predicts High-Fives, Hugs and Kisses

At MIT's AI lab, researchers trained a few neural networks on hugs, kisses and high-fives taken from YouTube clips and TV series, and the networks were then able to predict those actions: Teaching machines to predict the future. The success rate is still low (43%), but at the current pace of innovation, machines that read body language and use it to predict people's motives and actions are probably not far off. Humans, too, can predict upcoming actions only 71% of the time, so machines will likely overtake us here very soon. Now add predictive crime and surveillance cameras to this equation. Yay!

This week researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have made an important new breakthrough in predictive vision, developing an algorithm that can anticipate interactions more accurately than ever before. Trained on YouTube videos and TV shows such as “The Office” and “Desperate Housewives,” the system can predict whether two individuals will hug, kiss, shake hands or slap five. In a second scenario, it could also anticipate what object is likely to appear in a video five seconds later. […]

After training the algorithm on 600 hours of unlabeled video, the team tested it on new videos showing both actions and objects. When shown a video of people who are one second away from performing one of the four actions, the algorithm correctly predicted the action more than 43 percent of the time, compared with 36 percent for existing algorithms.

In a second study, the algorithm was shown a frame from a video and asked to predict what object will appear five seconds later. For example, seeing someone open a microwave might suggest the future presence of a coffee mug. The algorithm predicted the object in the frame 30 percent more accurately than baseline measures, though the researchers caution that it still only has an average precision of 11 percent.
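For readers unfamiliar with the term "average precision" used in the quote above, here is a minimal sketch of one common definition, computed over a ranked list of guesses. The excerpt does not say which variant the researchers used, so the function name `average_precision` and the sample input are purely illustrative assumptions.

```python
def average_precision(ranked_relevance):
    """Average precision for one ranked list of guesses.

    ranked_relevance: booleans, best guess first; True marks a correct guess.
    This is one common textbook definition, assumed here for illustration;
    it is not taken from the MIT paper.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            # Precision measured at the rank of each correct guess.
            precision_sum += hits / rank
    # With no correct guesses at all, report 0 rather than dividing by zero.
    return precision_sum / hits if hits else 0.0


# Hypothetical ranking: first and third guesses are correct.
print(average_precision([True, False, True]))  # (1/1 + 2/3) / 2
```

A score of 11 percent on a measure like this means that correct objects tend to sit far down the ranked list, which is why the researchers caution against overstating the result.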

It’s worth noting that even humans make mistakes on these tasks: for example, human subjects were only able to correctly predict the action 71 percent of the time.
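To put the quoted accuracy figures side by side, here is a small sketch of the arithmetic. The percentages come from the excerpt ("more than 43 percent" is rounded down to 0.43); the chance level of 25 percent is an assumption that the four actions occur equally often, which the excerpt does not state. The names `ACCURACY`, `CHANCE` and `relative_gain` are made up for this example.

```python
def relative_gain(new, old):
    """Relative improvement of `new` over `old`, as a fraction of `old`."""
    return (new - old) / old


# Figures quoted in the excerpt.
ACCURACY = {
    "existing algorithms": 0.36,
    "MIT algorithm": 0.43,
    "human subjects": 0.71,
}

# Assumption: four equally likely actions (hug, kiss, handshake, high-five).
CHANCE = 1 / 4

for name, acc in ACCURACY.items():
    print(f"{name}: {acc:.0%} ({acc / CHANCE:.2f}x chance)")

gain = relative_gain(ACCURACY["MIT algorithm"], ACCURACY["existing algorithms"])
print(f"gain over existing algorithms: {gain:.1%}")
```

Seen this way, the step from 36 to 43 percent is a relative gain of just under a fifth, while the gap to human performance remains considerably larger.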