Friday, August 13, 2010

Google using Caffeine to stay awake at all times

Google recently announced the completion of a new web indexing system called Caffeine. Google claims that Caffeine provides 50 percent fresher results for web searches than its previous index, and that it is the largest collection of web content the company has ever offered. This so-called next-generation Google search engine, “Google 2.0”, is designed specifically to compete with the likes of Bing and Facebook.

But before we discuss how this new search engine indexer is going to affect current users, publishers and power searchers, let me explain what exactly a search engine indexer does.

Search engine indexing collects, parses, and stores data to facilitate fast and accurate information retrieval. So when you search Google, you are not searching the live web. Instead, you are searching Google's index of the web which, like the index at the back of a book, helps you pinpoint exactly the information you need. So anyone who thinks that Google's search results show all the information that exists on the web is wrong. In fact, they show only the information that Google has been able to find and add to its index.
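To make the back-of-the-book analogy concrete, here is a minimal sketch of an inverted index in Python. It is only a toy illustration and not how Google's indexer works internally; the function names `build_index` and `search` and the example URLs are made up for this post.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start with the pages for the first word, then narrow down.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical pages, standing in for whatever the crawler has found so far.
pages = {
    "example.com/a": "caffeine keeps the index fresh",
    "example.com/b": "the old index was refreshed in layers",
}
index = build_index(pages)
print(search(index, "index fresh"))  # {'example.com/a'}
```

Notice that `search` never looks at the pages themselves, only at the index. A page that was never passed to `build_index` cannot show up in the results, which is exactly why search results cover only what the search engine has already found.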

This new indexing method, "Caffeine", is nothing but a more exhaustive and continuous cataloging of the vast web. The earlier web crawling and indexing system worked in layers. While the main layer was re-indexed about once every two weeks, the other layers did not update uniformly. Refreshing a single layer involved analyzing the web in its entirety. A user searching Google might not get the most recently posted articles, updated pages, or Twitter and Facebook conversations, because of the delay between finding an updated page and indexing it, which made that content unavailable to users in real time. With Caffeine, Google analyzes the web in small portions and updates its search index on a continuous basis, globally. As Google finds new pages, or new information on existing pages, it adds them straight to the index. That means users can find fresher information than ever before, no matter when or where it was published.
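The difference between the two approaches can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not Google's actual design: the class `IncrementalIndex` and the function `rebuild_batch` are hypothetical names, and a real index tracks far more than single words.

```python
from collections import defaultdict


class IncrementalIndex:
    """Toy index that is updated one page at a time."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> URLs containing it
        self.page_words = {}              # URL -> words currently indexed

    def update_page(self, url, text):
        """Index a new or changed page without touching any other page."""
        new_words = set(text.lower().split())
        old_words = self.page_words.get(url, set())
        for word in old_words - new_words:
            self.postings[word].discard(url)  # word no longer on the page
        for word in new_words - old_words:
            self.postings[word].add(url)      # word newly on the page
        self.page_words[url] = new_words

    def search(self, word):
        return set(self.postings.get(word.lower(), set()))


def rebuild_batch(pages):
    """Old-style refresh: reprocess every page to produce a new index."""
    index = IncrementalIndex()
    for url, text in pages.items():
        index.update_page(url, text)
    return index


index = IncrementalIndex()
index.update_page("example.com/news", "old headline")
index.update_page("example.com/news", "fresh headline")
print(index.search("fresh"))  # {'example.com/news'}
print(index.search("old"))    # set()
```

With `update_page`, a changed page becomes searchable as soon as it is processed. With `rebuild_batch`, the same change is visible only after every page has been reprocessed, which is roughly the kind of delay the layered system suffered from.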

Caffeine takes up nearly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day. Most users won’t notice a change in their search results, but web developers and power searchers might.

This Google update is huge from a search engine optimization and link building standpoint, because every time new algorithms and infrastructure changes like this happen, old SEO techniques can lose their relevance, which makes the lives of current SEO gurus more challenging. In my next post I'll talk more about how SEO can be done for this new infrastructure from Google.
