Pocket Supercomputer

Video-equipped cellphones can tap into virtually limitless computing capabilities through low-latency streaming and real-time video analysis on a remote server. Combined with augmented reality, this gives users real-time information, translations, ratings, and videos about the products filmed by the mobile device. Two patents were filed and the project received extensive international media coverage.
This technology, developed by Fredrik Linaker, lets any ordinary 3G cellphone equipped with a video camera be used to find useful information about the surrounding world. The prototype system, dubbed the Pocket Supercomputer, offers a simple way to seek out useful, hard-to-find information.

SIFTing objects

If a user points the phone’s camera at a foreign food item, the system can automatically identify ingredients that might cause an allergic reaction. Similarly, when shown a book, it can quickly perform an online price comparison or find a review (see the video). Live video footage is fed from the handset to a central server, which extracts feature points from the incoming frames and rapidly matches the on-screen objects against images previously entered into a database. Relevant information is then sent back to the user.

By offloading the processing from the mobile device onto a server, there are few limits on the storage capacity and processing power available for storing and searching images.
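As a rough illustration of this offload pattern, the sketch below shows a handset-side client posting a single JPEG frame to a remote matching service and reading back the result. The endpoint URL and response fields are hypothetical; the actual prototype streamed live video over 3G to its own server.

```python
# Minimal sketch of the offload pattern: the handset sends one video frame
# to a remote server and receives whatever information the server matched
# against its image database. The endpoint and JSON fields are hypothetical.
import requests

SERVER_URL = "http://example.com/match"  # hypothetical matching endpoint

def query_frame(jpeg_bytes: bytes) -> dict:
    """Upload one JPEG-encoded frame and return the server's match result."""
    response = requests.post(
        SERVER_URL,
        files={"frame": ("frame.jpg", jpeg_bytes, "image/jpeg")},
        timeout=5,
    )
    response.raise_for_status()
    return response.json()  # e.g. {"object": "...", "info_url": "..."}

# Example usage with a frame saved to disk:
# with open("frame.jpg", "rb") as f:
#     print(query_frame(f.read()))
```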

For example, the technology can recognise a painting such as Vermeer’s Girl with a Pearl Earring. When the phone’s camera takes video footage of the painting, a search of the database returns information about the painting and links the phone to the movie of the same title.

Foodstuffs can be identified by their packets, even if the name of the foodstuff is written in non-Latin characters, such as the Chinese pack of soup seasoning pictured above.



Foreign languages and characters can be translated into the user’s language, so a user can find out what an object is. Search results can be personalised, so the user can be alerted if a foodstuff contains a certain allergen, for example.
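To make the personalisation idea concrete, here is a hypothetical post-processing step that could run once the server has identified a product and looked up its ingredient list. The function name and data structures are illustrative only, not part of the prototype.

```python
# Hypothetical personalisation step: flag any ingredients that match the
# user's registered allergens after the product has been identified.
def allergen_alerts(ingredients: list[str], user_allergens: set[str]) -> list[str]:
    """Return the subset of ingredients that match the user's allergen list."""
    return [item for item in ingredients
            if any(allergen in item.lower() for allergen in user_allergens)]

print(allergen_alerts(["wheat flour", "peanut oil", "salt"], {"peanut"}))
# ['peanut oil']
```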

Businesses can use the application for inventory purposes, or to train staff to recognise different electrical components.

The central server uses an algorithm called the Scale-Invariant Feature Transform (SIFT) to match objects. The algorithm uses hundreds or thousands of reference points, corresponding to physical features such as edges, corners or lettering, to find a match. The process works no matter how the object is oriented, but objects must first be carefully imaged and entered into the central database. A three-dimensional model of an object can also be sent to the phone, so the user can look at the virtual object from different angles.
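The sketch below shows the same kind of feature-point matching using OpenCV's SIFT implementation together with Lowe's ratio test; it illustrates the technique, not the prototype's actual server code, and the file names are placeholders.

```python
# SIFT feature matching with OpenCV (requires opencv-contrib-python or a
# recent OpenCV release where cv2.SIFT_create is available).
import cv2

query = cv2.imread("frame.jpg", cv2.IMREAD_GRAYSCALE)          # frame from the phone
reference = cv2.imread("database_item.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp_q, desc_q = sift.detectAndCompute(query, None)
kp_r, desc_r = sift.detectAndCompute(reference, None)

# Match descriptors and keep only matches that pass Lowe's ratio test,
# which discards ambiguous correspondences.
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(desc_q, desc_r, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

print(f"{len(good)} good matches")  # a high count suggests the same object
```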

Database of images

Creating a database containing 5000 items takes about a day, although it then takes just a few milliseconds to match an object. Eventually one could imagine having a single enormous general-purpose database. Advances in image recognition have prompted several other companies to research similar cellphone search technologies. Microsoft has a system called Lincoln that lets users take snapshots and send them off for identification. Another system developed by Evolution Robotics of Pasadena, California, called ViPR, also uses video footage to identify objects, and is already available in Japan.
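One way such a split between slow offline indexing and millisecond queries can be realised is sketched below: SIFT descriptors are extracted for every reference image once, stored in an approximate nearest-neighbour index (FLANN here), and each query votes for the reference image it matches most. The file names are illustrative; this is an assumption about the approach, not the prototype's code.

```python
# Offline: build a FLANN index over SIFT descriptors of all reference images.
# Online: match a query frame and vote for the best reference image.
import cv2
import numpy as np

sift = cv2.SIFT_create()

def descriptors(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, desc = sift.detectAndCompute(img, None)
    return desc

reference_paths = ["item_0001.jpg", "item_0002.jpg"]   # ... up to thousands
flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5),  # FLANN_INDEX_KDTREE
                              dict(checks=50))
flann.add([descriptors(p) for p in reference_paths])
flann.train()                                           # slow, done once

# Query time: each matched descriptor votes for the image it came from.
matches = flann.match(descriptors("query_frame.jpg"))
votes = np.bincount([m.imgIdx for m in matches], minlength=len(reference_paths))
print("best match:", reference_paths[int(np.argmax(votes))])
```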

The database used for video search can be built automatically; spiders have been written to crawl the web and download images on a specific theme, such as Asian food, Google image search results, or all Amazon products. The need for custom databases led to another project: Automated Visual Repository Building.
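A minimal sketch of such an automated repository builder is shown below: it downloads a list of themed image URLs into a local folder that an indexing step like the one above could consume. The URL is a placeholder; the actual spiders crawled specific sources.

```python
# Download a themed list of image URLs into a local repository folder.
import os
import requests

def download_images(urls, out_dir="repository"):
    os.makedirs(out_dir, exist_ok=True)
    for i, url in enumerate(urls):
        try:
            resp = requests.get(url, timeout=10)
            resp.raise_for_status()
        except requests.RequestException:
            continue  # skip unreachable or broken images
        with open(os.path.join(out_dir, f"item_{i:04d}.jpg"), "wb") as f:
            f.write(resp.content)

download_images(["http://example.com/asian_food/0001.jpg"])  # placeholder URL
```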

Robot navigation

Krystian Mikolajczyk, a computer vision researcher at the University of Surrey, Guildford, UK, says it is preferable for image processing to be done remotely. “It’s hard to store a large database on the cellphone,” he says. “It is also difficult to propose generic software for any brand and model.”

“This is the type of application for which SIFT was developed,” adds David Lowe, who developed the SIFT algorithm in 2004 and is a computer vision expert at the University of British Columbia in Vancouver, Canada. “I think it will be useful for cellphones – a very convenient method for users to get information about the objects and locations in their surroundings.”

International Media Coverage

Two patents were filed for this project and it has received extensive international media coverage:

Computing-co-uk article

Make-iTV article

MobiFrance article

NewScientist article

RealSEO article

Slashdot article

ZDNet article
