SciTech

Kinect gaming accessory hacked for teleconferencing


Kinect, the popular motion-detecting gaming accessory from Microsoft, may turn out to be an effective tool for videoconferencing as well. A group of students from the Massachusetts Institute of Technology hacked the Kinect to "enhance" distance-based Internet communication.

"The proliferation of broadband and high-speed Internet access has, in general, democratized the ability to commonly engage in videoconference. However, current video systems do not meet their full potential, as they are restricted to a simple display of unintelligent 2D pixels. We present a system for enhancing distance-based communication by augmenting the traditional video conferencing system with additional attributes beyond two-dimensional video," Lining Yao, Anthony DeVincenzi, Ramesh Raskar, and Hiroshi Ishii said in their paper.

"With Kinect camera and sound sensors, we explore how expanding a system's understanding of spatially calibrated depth and audio alongside a live video stream can generate semantically rich three-dimensional pixels containing information regarding their material properties and location," they added.

Using a Kinect camera and sound sensors, the students identified at least four features that can enhance videoconferencing:

  • Talking to Focus, where the system focuses on those currently speaking and can blur those who are not. The system can also display vital information about the speaker, including name and speaking time.
  • Freezing Former Frames, where people who do not want to be noticed by the other side can freeze themselves into a still image for a short time, which is handy if one wants to appear to sit and listen while actually checking email or holding a side conversation.
  • Privacy Zone, where the user can render himself or herself, or a specified area, invisible with a gestural command; objects moving in the foreground are left unaffected.
  • Spatial Augmented Reality, where people can click on certain objects on the screen and see augmented information about them remotely.
The setup includes two networked locations, each with a video screen for viewing the opposite space; a standard RGB digital web camera enhanced by a depth-sensing 3D camera like the Kinect; and calibrated microphones. C++ and the openFrameworks library are used for video processing and effect rendering, the MIT paper said. — TJD, GMA News