Ever since the dawn of time, man has worked tirelessly to make life better. With the advent of computers and the internet, communication and information transfer have become extremely easy. But there has always been one constant in how we interact with these machines.
No matter how powerful and complex a machine is, you have to be near it and in some form of physical contact with it to interact with it. Gesture recognition technology could change all that. If perfected and used correctly, it could render traditional input devices like keyboards, mice and touch screens redundant. Read on to find out more!

In simple words, gesture recognition is the science of interpreting human gestures as input commands using mathematical algorithms. This may include full body motion recognition, or something as small as a change in facial expression. Counterintuitively, reading facial expressions is in fact a more difficult task than recognizing more pronounced gestures.
Generally, gestures are classified into two types. Which type is used depends on the level of interaction required, and neither is inherently better than the other, but the distinction makes things simpler to understand. The two types of gestures are:
- Online Gestures – Put simply, these are gestures that control a machine or computer system in real time. Also called direct manipulation gestures, these are the sort that let you interact with objects and make changes that are visible as you perform them.
- Offline Gestures – Gestures that are processed after they're completed are called offline gestures. In other words, these are gestures that don't produce a real-time change. For example, some new smartphones can open a specific application after a particular gesture is made; this is basically an offline gesture.
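The distinction above can be sketched in code. In this illustrative example (all function names and the stream format are invented for the purpose), the same stream of pointer deltas is handled two ways: an online handler applies a visible change after every sample, while an offline handler waits for the whole stroke to finish and then classifies it as a single command.

```python
# Illustrative sketch of online vs. offline gesture handling.
# The stream is a list of (dx, dy) pointer deltas per frame; all
# names here are made up for the example.

def online_drag(stream):
    """Online: report the dragged object's position after every sample."""
    positions = []
    x, y = 0, 0
    for dx, dy in stream:
        x += dx
        y += dy
        positions.append((x, y))  # a viewable change as the gesture happens
    return positions

def offline_swipe(stream):
    """Offline: wait for the whole stroke, then classify it once."""
    net_x = sum(dx for dx, _ in stream)
    net_y = sum(dy for _, dy in stream)
    if abs(net_x) > abs(net_y):
        return "swipe-right" if net_x > 0 else "swipe-left"
    return "swipe-down" if net_y > 0 else "swipe-up"

stroke = [(3, 0), (4, 1), (5, -1)]   # pointer deltas per frame
print(online_drag(stroke))           # position updated three times
print(offline_swipe(stroke))         # one command after the stroke ends
```

The point of the sketch is where the work happens: `online_drag` produces output inside the loop, frame by frame, whereas `offline_swipe` only looks at the completed stroke.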
Besides being really functional, there's no arguing that gesture control looks really cool. As in the 2002 hit movie Minority Report, there will probably come a time when everything is controlled by gestures. While it might seem like a technology that will only increase our lethargy, the truth is that besides making life easier, it also has a vast array of applications in almost all fields. Some of these applications include:
- Medical Applications – Advanced robotics systems with gesture recognition can be placed in hospitals or homes to recognize and treat life threatening conditions like heart attacks or strokes.
- Alternative computer interfaces – Gesture recognition, along with voice recognition, facial recognition, lip movement recognition and eye tracking combined can be used to create something called a perceptual user interface (PUI), a completely different way to interact with computer systems which will improve usability and creativity by leaps and bounds.
- Entertainment applications – Most videogames today are played on game consoles, arcade units or PCs, and all require a combination of input devices. Gesture recognition can be used to truly immerse players in the game world like never before.
- Automation systems – In homes, offices, transport vehicles and more, gesture recognition can be incorporated to greatly increase usability and reduce the resources necessary to create primary or secondary input systems like remote controls, car entertainment systems with buttons or similar.
- An easier life for the disabled – One of the biggest challenges faced today is providing separate yet equally non-cumbersome services to the differently abled. While there are special provisions around the world, there's still huge room for improvement in bringing all lives onto an equal footing. Gesture recognition technology can eliminate a lot of manual labor and make life much easier for those who aren't as fortunate as most of us are.
These are just a handful of the places and situations in which gesture recognition technology can be implemented, and as is evident, it can totally change the way we interact with the world around us, not only at home but in commercial venues as well. In fact, a South African company came up with an innovative machine, placed at Tambo International Airport, that detected travellers who yawned or looked sleepy and dispensed free cups of coffee. Although it used only basic facial and gesture recognition technology, it is nonetheless an interesting look into what can be done with this technology.

Currently, there aren't too many gesture recognition applications available for public use. Interestingly, despite its potential for real world applications, gesture recognition technology is actually dominated by the videogame industry. Electronics giants Microsoft and Sony, makers of the Xbox and PlayStation lines of consoles respectively, have incorporated gesture recognition into their entertainment systems via extra hardware, called 'Kinect' in the case of Microsoft and the 'PlayStation Eye/Camera' in the case of Sony. These devices bring us one step closer to the future. While Microsoft went ahead in 2014 and included the Kinect 2.0 camera with the Xbox One, its latest gaming console, making gesture and voice control an integral part of it, Sony has left the PlayStation Camera as an optional accessory for the PlayStation 4, focusing instead on traditional input methods.
First introduced as an accessory to the Xbox 360 in 2009, the Kinect was not exactly a huge hit, but it paved the way for meaningful gesture-controlled gaming. The initial iteration was based on camera technology from Israeli developer PrimeSense, and featured an RGB camera, a depth sensor and a multi-array microphone to provide facial, gesture and voice recognition. The depth sensor was basically an infrared laser projector paired with a CMOS sensor that let the device capture point-to-point 3D data regardless of ambient light, an extremely important factor that varies wildly from household to household, or for that matter, anywhere the console is used.
The first version of the Kinect captured images and video at a resolution of 640 x 480 pixels with 2,048 levels of depth sensitivity. It borrowed processing power from the main console and was fairly power efficient. Capable of tracking up to six people, with two active players analysed for motion, the Kinect was a brave step forward, bringing gesture recognition technology to homes everywhere.

The Kinect 2.0, launched with the Xbox One and coming soon to Microsoft Windows, bumps up the specs quite a bit, making it far more sensitive; so sensitive, in fact, that it can even monitor your heart rate. Combined with voice and facial recognition technology, the Kinect 2.0 lets you control the Xbox One console entirely without a controller. The maximum resolution was raised to full HD (1920 x 1080 px), and the depth sensor is almost twice as powerful as the one used previously. With a wider field of view, more powerful microphones, better skeletal tracking and more defined joint tracking, the Kinect 2.0 is an extremely powerful device that also has exciting applications in medical science and everyday life; there are already developers working on incorporating its power into medical applications. In current usage, a user can wave his or her hands around to control the menu and interact with the console using voice, and the console, by the way, also recognizes users by face. This is very close to the PUI we mentioned earlier, isn't it?
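To give a feel for what "skeletal tracking" buys a developer, here is a minimal sketch of turning per-frame joint positions into a gesture. This is not the actual Kinect SDK: the frame format, joint names and threshold are invented for illustration, but the idea, comparing tracked joints across frames, is how such systems are typically used.

```python
# Hypothetical sketch: detecting a hand wave from skeletal-tracking data.
# Each frame gives x-coordinates for a hand and elbow joint (made-up
# format); enough left/right reversals of the hand about the elbow
# within the window reads as a wave.

def detect_hand_wave(frames, threshold=3):
    """Return True if the hand crosses the elbow line at least
    `threshold` times over the frame window."""
    reversals = 0
    prev_side = None
    for frame in frames:
        side = frame["hand_x"] > frame["elbow_x"]  # hand right of elbow?
        if prev_side is not None and side != prev_side:
            reversals += 1
        prev_side = side
    return reversals >= threshold

# Fabricated joint data: the hand oscillates across the elbow line.
frames = [{"hand_x": x, "elbow_x": 0.0}
          for x in (0.2, -0.2, 0.25, -0.15, 0.3)]
print(detect_hand_wave(frames))  # True
```

A real pipeline would add smoothing and timing constraints, but the core remains simple geometry over the joint stream the sensor provides.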
Sony PlayStation Eye/Camera
First released as the PlayStation EyeToy for the venerable PlayStation 2 console, the EyeToy was simply a glorified webcam, and didn't track gestures nearly as accurately as its successors would. Games were released using this mechanism, but they weren't well received because of the low sensitivity; the EyeToy basically just captured your image or video and pasted it into the game without really tracking you.

The PlayStation Eye was the next product in Sony's line of gesture recognition technology for entertainment systems. Launched in 2007, it could capture video at 640 x 480 px and was said to be 'two times as sensitive' as the PlayStation EyeToy. Again, it was just an evolution of its predecessor and didn't bring much to the table in terms of tracking. Used for simple Digimask face mapping, video chatting, edge detection and color tracking, the PlayStation Eye was a small step forward for Sony.

The PlayStation Camera for the PlayStation 4 is actually similar to what Microsoft has been doing with the Kinect range of devices. Featuring two 1280 x 800 pixel cameras with an 85-degree field of view, which can be used for depth tracking, motion sensing and other functions depending on the application, the PlayStation Camera is another really interesting chapter in gesture recognition technology. It works in conjunction with the PlayStation 4 controller to track motion, faces, people and voices, much like the Kinect. Most famously used to show off the 'Playroom' demo, the PlayStation Camera is an interesting device built slightly differently but for the same purpose.
Now for what is perhaps the most accessible and easy-to-use gesture recognition platform: Leap Motion. Growing out of investor interest in 2010, Leap Motion only started selling its devices commercially in July 2013. So what exactly is Leap Motion? It's a hardware sensor that tracks hand and finger motions and translates them into input. The Leap Motion controller is a small USB-powered device, compatible with Windows and Mac, that uses two monochromatic IR cameras and three IR LEDs to track movements made by hands and fingers in a roughly 1 m hemispherical 3D space. The cameras reportedly generate 300 frames per second of data, which is analysed and interpreted by proprietary software.

It's like a smaller version of the Kinect, incorporating only gesture recognition, in a smaller space. However, unlike the Microsoft Kinect and Sony PlayStation Eye/Camera, Leap Motion finds uses in a far wider range of applications since it connects to PCs. Leap Motion is also available as part of ASUS and Hewlett-Packard laptops and keyboards, which incorporate the controller into the chassis and thus eliminate the need for extra hardware. While mostly used for gaming at the moment, Leap Motion is quickly gaining popularity in design, social and creative applications, since it breaks away from conventional keyboard and mouse usage to bring far greater control to the hands of developers and creators. There are already applications in which Leap Motion technology is being used to control robots, which opens up a whole new world of possibilities. Considering that a Leap Motion controller costs just $100 and that applications and games are already being developed exclusively for it, Leap Motion is perhaps the biggest step forward for gesture recognition technology.
While the PlayStation Eye/Camera does a great job of advancing gesture recognition, there's no denying that the lack of developer access and a closed ecosystem prevent it from reaching the heights it truly could. The Kinect, on the other hand, is already available for purchase, and the Kinect 2.0 will hit stores soon with an SDK (software development kit) in tow, letting developers truly play around and create something amazing. Leap Motion technology is already finding its way into laptops and tablets, and isn't even significantly expensive, so it could prove to be a big player in the gesture recognition game. It's truly exciting to see gesture recognition technology catching on, and with manufacturers granting access to developers, we might soon see something totally amazing that even the creators themselves didn't imagine. Man has always pushed the boundaries with technology, and in this case the work of a few can actually change the world. Perhaps a future where I could type this without even using a keyboard isn't so far away; perhaps a future where we don't need to physically touch household appliances isn't far away either. Most importantly, maybe a world where the differently abled can do everything without extra help isn't too far away.