How does object recognition work?
There are two main approaches to object recognition in AR: appearance-based methods and feature-based methods. Each encompasses several sub-methods.
As the name suggests, appearance-based methods consider comparable, detectable features of objects and their similarity to templates or exemplars. The main challenge for these methods is the simple fact that any single object may look completely different depending on lighting conditions, the distance or angle from which it is viewed, and even its age. This means that a highly effective appearance-based system needs a large library of templates for its evaluation algorithms to draw on, which presents obvious problems with storage space, time, and the manpower necessary to build it. The most commonly used approaches for appearance-based evaluation are edge matching, divide-and-conquer search, greyscale matching, gradient matching, histograms of receptive field responses, and large model bases.
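To make greyscale matching concrete, the sketch below slides a small template over an image and scores each position with normalized cross-correlation. This is a minimal illustration of the idea, not a production matcher: the image and template here are synthetic arrays, and a real appearance-based system would compare against many templates per object and use a fast correlation implementation.

```python
import numpy as np

def ncc_match(image, template):
    """Slide a greyscale template over an image and return the
    top-left corner of the best match, scored by normalized
    cross-correlation (a score of 1.0 means a perfect match)."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    best_score, best_pos = -np.inf, (0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            patch = image[y:y + th, x:x + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p ** 2).sum()) * t_norm
            if denom == 0:
                continue  # flat patch: correlation undefined, skip
            score = (p * t).sum() / denom
            if score > best_score:
                best_score, best_pos = score, (y, x)
    return best_pos, best_score

# Synthetic example: embed the template at a known location.
rng = np.random.default_rng(0)
image = rng.random((40, 40))
template = rng.random((8, 8))
image[12:20, 25:33] = template
pos, score = ncc_match(image, template)
print(pos, score)  # (12, 25) with a score of 1.0
```

Because the score is normalized, the match is tolerant to uniform brightness and contrast changes, which hints at why appearance-based systems still need many templates: any other variation (angle, scale, occlusion) must be covered by additional exemplars.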
Feature-based methods look for similar features in an idealized model of an object and a real image. Consider face recognition, for example: it is possible to define a set of features associated with the human face. Using these features, a software algorithm can generate a model that is placed over the captured image; if enough features of the model match the image, we have a positive match. Common feature-based detection methods include interpretation trees, the hypothesize-and-test method, pose consistency, pose clustering, invariance, geometric hashing, the scale-invariant feature transform (SIFT), and speeded up robust features (SURF).
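A core step shared by SIFT- and SURF-style pipelines is matching feature descriptors between a model and an image by nearest-neighbour search, filtered with Lowe's ratio test. The sketch below illustrates just that matching step on synthetic descriptors; a real pipeline would first detect keypoints and compute descriptors (e.g. 128-dimensional SIFT vectors), and the descriptor size and ratio threshold here are illustrative assumptions.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    """Match each descriptor in desc_a to its nearest neighbour in
    desc_b, keeping a match only if it is clearly better than the
    second-best candidate (Lowe's ratio test)."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        nearest, second = order[0], order[1]
        # Ambiguous matches (nearest barely beats runner-up) are rejected.
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, int(nearest)))
    return matches

# Synthetic descriptors: desc_b reuses two rows of desc_a with slight noise,
# simulating the same features seen in a second image.
rng = np.random.default_rng(1)
desc_a = rng.random((5, 32))
desc_b = rng.random((6, 32))
desc_b[2] = desc_a[0] + 0.01
desc_b[4] = desc_a[3] + 0.01
print(ratio_test_matches(desc_a, desc_b))  # recovers (0, 2) and (3, 4)
```

The ratio test is what makes this style of matching robust: unrelated descriptors in high-dimensional space tend to be roughly equidistant, so only genuine correspondences pass the threshold. The surviving matches would then feed a geometric-consistency stage such as pose clustering.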
Google Glass was among the first widely known systems to demonstrate how object recognition can be used in AR. First released by Google in February 2013, the system uses a liquid crystal on silicon (LCoS) optical head-mounted display with a 640×360-pixel resolution to show information from many existing Google applications, as well as those created by third-party developers. Google Glass is able to recognize an image in a scene, track its position, and augment the view with appropriate information. This may be a map with navigation, price information for shoppers, office and business hours, or nutrition information for food items.
Developing augmented reality apps for Google Glass is very accessible thanks to several SDKs currently available, with the Wikitude SDK being the most popular. The SDK enables developers to take advantage of built-in image recognition and tracking technology, use position-aware services with geo-referenced data, embed videos from YouTube or Vimeo, and more. It allows anybody interested in augmented reality and object recognition to quickly develop a new application and showcase how this technology could be integrated with our daily lives. Other solutions include, for example, CraftAR, a suite of tools created for agencies, publishers, and companies that would like to leverage AR and image recognition to create ads, magazines, and catalogs. The technology works on all major mobile operating systems and supports the most popular programming languages and 3D engines.