A few weeks ago, a team of Google researchers presented a project that may just revolutionize image search. Image search has grown rapidly in popularity over the last few years, but development in the field has been slow at best. Search results still depend on textual clues such as filenames, image titles, and the accompanying body text on the webpage. The new research tries to improve on this by allowing key features of the image itself to be part of the search criteria.
This used to be deemed impossible because of the computational cost. Let me explain. Text-based search is relatively easy because words are only a few bytes long, and the algorithms have been thoroughly refined for fast, high-volume searches. Images, meanwhile, usually run from tens to hundreds of kilobytes each, so sifting through the millions of images online takes considerably more time. The solution proposed in the paper is a hybrid of text-based and image-based search. The process goes like this:
First, the current Google Image Search algorithm, which uses text clues, selects the 1,000 most relevant search results. Next, this more manageable set of images is processed to look for similarities: Google tries to find the visual patterns most commonly shared across the set. A system called VisualRank then determines “authority”. The image that bears the features repeated most often among all the other images ranks the highest. In a sense, it’s not much different from PageRank, the tried and tested Google algorithm that uses links as its primary determinant. The key to ranking well, it seems, is to isolate the common denominator in your photos, that is, the features that most images of the same subject share. As they refine the algorithm, getting good rankings is bound to get more complex as well.
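To make the PageRank comparison a little more concrete, here is a rough sketch of how a ranking like this could work. It is not Google's actual code. It assumes the text-based filter has already run and that we somehow have a table of similarity scores between every pair of images; the `similarity` table, the `visual_rank` function, and the 0.85 damping value (the figure usually quoted for PageRank) are all my own stand-ins for illustration.

```python
def visual_rank(similarity, damping=0.85, iterations=50):
    """PageRank-style power iteration over an image similarity table.

    similarity[i][j] is a non-negative score for how much image i and
    image j look alike. How that score is computed is left out here;
    the table is a hypothetical input.
    """
    n = len(similarity)
    # Total similarity each image j hands out; used to split its "vote".
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    ranks = [1.0 / n] * n  # start with every image equally important

    for _ in range(iterations):
        new_ranks = []
        for i in range(n):
            vote = 0.0
            for j in range(n):
                if col_sums[j] > 0:
                    # Image j passes rank to i in proportion to similarity.
                    vote += similarity[i][j] / col_sums[j] * ranks[j]
                else:
                    # An image similar to nothing spreads its rank evenly.
                    vote += ranks[j] / n
            new_ranks.append(damping * vote + (1 - damping) / n)
        ranks = new_ranks
    return ranks


# Toy data: images 0, 1 and 2 resemble each other; image 3 is an outlier.
similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.0],
    [0.8, 0.7, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.0],
]
ranks = visual_rank(similarity)
order = sorted(range(len(ranks)), key=lambda i: ranks[i], reverse=True)
print(order)  # the three mutually similar images come before the outlier
```

The point of the toy data is that the image sharing the most with everything else floats to the top, while the odd one out sinks, which is the "authority" idea described above.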
Results have been very encouraging. A comparison of the current Google Image Search with the new method showed that a significant majority of users preferred the latter. Its results contained fewer irrelevant images, which increased user satisfaction. Although the method still has some issues to work out and has yet to get the green light for deployment, it has tremendous potential for product image searches. That makes it worth keeping an eye on for any serious web worker.