Algorithmic Eyeroll
Neural-network fuzzies at the Skolkovo Institute of Science and Technology have trained an artificial intelligence to correct gaze directions in images: DeepWarp: Photorealistic Image Resynthesis for Gaze Manipulation. Reminiscent of Tom White's Smile-Vector, and just like his work it's a (for now) very elaborate gimmick that you can (for now) still pull off more easily in Photoshop, and that (for now) only works on stills. (via CreativeAI)

In this work, we consider the task of generating highly-realistic images of a given face with a redirected gaze. We treat this problem as a specific instance of conditional image generation, and suggest a new deep architecture that can handle this task very well as revealed by numerical comparison with prior art and a user study. Our deep architecture performs coarse-to-fine warping with an additional intensity correction of individual pixels.

All these operations are performed in a feed-forward manner, and the parameters associated with different operations are learned jointly in an end-to-end fashion. After learning, the resulting neural network can synthesize images with manipulated gaze, while the redirection angle can be selected arbitrarily from a certain range and provided as an input to the network.
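The warping idea at the core of this is easy to sketch: a network predicts a per-pixel flow field, and the output image is produced by bilinearly resampling the input along that flow. Here is a minimal NumPy version of just the sampling step; the flow-predicting network and the per-pixel intensity correction are omitted, and all names are illustrative, not the paper's actual code:

```python
import numpy as np

def bilinear_warp(image, flow):
    """Warp an image by a per-pixel flow field of shape (H, W, 2)
    via bilinear sampling, the kind of differentiable sampling layer
    a warping architecture like DeepWarp applies after a network has
    predicted the flow. (Sketch only; the predictor is omitted.)"""
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    # Source coordinates each output pixel samples from, clamped to the image.
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    fx, fy = sx - x0, sy - y0
    # Blend the four neighbouring source pixels.
    top = image[y0, x0] * (1 - fx) + image[y0, x1] * fx
    bot = image[y1, x0] * (1 - fx) + image[y1, x1] * fx
    return top * (1 - fy) + bot * fy
```

With a zero flow this is the identity; a constant flow of (+1, 0) samples every pixel from its right-hand neighbour, i.e. shifts the image one pixel to the left. In the actual architecture the flow would come from the coarse-to-fine network, conditioned on the desired redirection angle.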

[update] Haha, shortly after this post went up, this slid through my timeline: facial-expression and gaze-direction tracking for VR.

Since there are no datasets for gaze directions, for their work they simply clamped a few test subjects into a Clockwork Orange contraption and had them stare at a moving dot:


There are no publicly available datasets suitable for the purpose of the gaze correction task with continuously varying redirection angle. Therefore, we collect our own dataset (Figure 4). To minimize head movement, a person places their head on a special stand and follows with their gaze a moving point on the screen in front of the stand. While the point is moving, we record several images with eyes looking in different fixed directions (about 200 for one video sequence) using a webcam mounted in the middle of the screen. For each person we record 2 to 10 sequences, changing the head pose and light conditions between different sequences. Training pairs are collected by taking two images with different gaze directions from one sequence. We manually exclude bad shots, where a person is blinking or where they are not changing gaze direction monotonically as anticipated. Most of the experiments were done on the dataset of 33 persons and 98 sequences.
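Building training pairs from such recordings is simple to sketch. Assuming each sequence is a list of (gaze_angle, frame) tuples (a simplification: the real gaze label is two-dimensional, and these names are illustrative, not the paper's code), pairing two images with different gaze directions from one sequence looks like this:

```python
from itertools import permutations

def make_training_pairs(sequence):
    """Build (input_frame, target_frame, redirection_angle) triples from
    one recording sequence of (gaze_angle, frame) tuples, by taking two
    images with different gaze directions from the same sequence.
    Illustrative sketch; the real labels are 2-D gaze angles."""
    triples = []
    for (a_src, f_src), (a_dst, f_dst) in permutations(sequence, 2):
        if a_src != a_dst:  # skip pairs with identical gaze direction
            triples.append((f_src, f_dst, a_dst - a_src))
    return triples
```

Because ordered pairs are taken, each image pair yields two training examples, one for each redirection direction, which is how the network sees both upward and downward (or left and right) redirections.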

Here's a non-algorithmic alternative with Kinski and Nosferatu from the great Ensalada tumblr: