“We want to make video conference calls as similar as possible to a real meeting,” explains Claudia Kuster, a doctoral student at the Computer Graphics Laboratory at ETH Zurich. Lack of eye contact is said to be a considerable obstacle to the feel of a ‘real’ conversation. The problem arises because speakers look mainly at their counterpart’s picture on the screen instead of at the camera. Kuster and her colleagues are now offering a solution for everyday use: software that recognises the face in the video and rotates it so that the person appears to be looking at the camera.
Until now, only larger companies have been able to afford the luxury of creating artificial eye contact during video conferences; this has required complex mirror systems or several cameras and special software. No satisfactory solution to the problem has existed for private use.
Depth map and facial recognition
This is now changing thanks to new software that Kuster has developed under the guidance of Markus Gross, Professor of Computer Science at ETH Zurich. Because it relies on Kinect, one of a new generation of cameras that collect colour and depth information simultaneously, the system is suitable for home use. The software combines a depth map calculated from the image information with a programme that recognises faces in real-time video.
In contrast to previous solutions, Kuster and her colleagues do not rotate the entire video image including the background. Rotating the whole image leaves gaps in the result, because the original contains no information about areas that only become visible after the rotation. Instead, her algorithm rotates only the face and inserts it seamlessly into the original image. To do so, the software looks for a contour around the face along which the border pixels in the original image and the corresponding pixels in the rotated image have colour values that are as similar as possible.
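The contour search described above can be illustrated with a minimal sketch. It is not the researchers' implementation: the article does not say how candidate contours are generated or compared, so the sketch assumes a hypothetical family of scaled ellipses around a known face centre and scores each one by the mean colour distance between the two images along its border. All function names and parameters are illustrative.

```python
import numpy as np

def ellipse_border(center, axes, n_points=180):
    """Integer pixel coordinates along an ellipse (assumed contour family)."""
    cy, cx = center
    ay, ax = axes
    t = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    ys = np.rint(cy + ay * np.sin(t)).astype(int)
    xs = np.rint(cx + ax * np.cos(t)).astype(int)
    return ys, xs

def seam_cost(original, rotated, ys, xs):
    """Mean colour distance between both images along the contour."""
    diff = original[ys, xs].astype(float) - rotated[ys, xs].astype(float)
    return float(np.mean(np.linalg.norm(diff, axis=-1)))

def best_seam(original, rotated, center, base_axes, scales):
    """Return (scale, cost) of the candidate contour with the lowest cost."""
    h, w = original.shape[:2]
    best = None
    for s in scales:
        ys, xs = ellipse_border(center, (base_axes[0] * s, base_axes[1] * s))
        if ys.min() < 0 or xs.min() < 0 or ys.max() >= h or xs.max() >= w:
            continue  # skip contours that leave the image
        cost = seam_cost(original, rotated, ys, xs)
        if best is None or cost < best[1]:
            best = (s, cost)
    return best

def composite(original, rotated, center, axes):
    """Copy the rotated pixels inside the ellipse into the original image."""
    h, w = original.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    inside = ((yy - center[0]) / axes[0]) ** 2 + ((xx - center[1]) / axes[1]) ** 2 <= 1.0
    out = original.copy()
    out[inside] = rotated[inside]
    return out
```

A low cost means the two images already agree along the contour, so the pasted face meets the untouched background without a visible seam. A real system would search over free-form contours rather than ellipses and would have to do so for every frame.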
Robust under adverse conditions
“The software can be adjusted by the user in just a few simple steps and is very robust,” says Kuster. If the programme temporarily loses sight of the face – for example, if the person turns their head or disappears behind an object such as a cup – the software leaves the original image in place. The software copes with changing light conditions and even with two faces at the same time, as the researchers demonstrate in a video. However, in the current version of the software, glasses prevent facial recognition, so wearers must take them off for video conferences.
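The fallback behaviour described above can be sketched as a per-frame control flow. This is an illustration under assumptions, not the researchers' code: `detect_faces` and `correct_gaze` are hypothetical callables standing in for the face recognition and the face rotation steps.

```python
def process_frame(frame, detect_faces, correct_gaze):
    """Correct the gaze of every detected face, or pass the frame through."""
    faces = detect_faces(frame)  # hypothetical detector, returns a list
    if not faces:
        # Face lost (head turned, hidden behind a cup): keep the original image.
        return frame
    out = frame
    for face in faces:  # handles two or more faces in the same frame
        out = correct_gaze(out, face)  # hypothetical rotation and compositing
    return out
```

Passing the untouched frame through when detection fails means that the worst case is an ordinary video call rather than a visibly broken image.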
Although the new generation of cameras with depth sensors is affordable, these cameras are still more expensive than a standard webcam. In addition, current laptops, tablets and smartphones are not yet equipped with this technology. Kuster and her colleagues plan to develop the software further for mobile devices with standard webcams and to simplify it for the user as much as possible. The researchers also want to develop a Skype plug-in that users can easily install to maintain eye contact in future calls.
Source: ETH Zurich