Get Ready For Some Caffeine

Posted by Malcolm Slade on Friday, November 20th, 2009 in Google, SEO

Back in August, we reported on the news from Google that a major infrastructure update, codenamed Google Caffeine, was in the pipeline. Due to the scale of this update, Google even released a preview sandbox for us all to investigate and provide feedback on. This sandbox not only gave our analysts an insight into what may be changing in the Caffeine update, it also allowed Google to gather a lot of test data and feedback (something most other tech companies fail to do).

Well, the public testing is now officially over and Caffeine has been placed behind closed doors prior to launch, meaning it’s time to review what we learnt and what we can expect from Caffeine in 2010.

Caffeine-Thanks

Visiting the Caffeine URL (www2.sandbox.google.com) now forwards you to http://www.google.com/errors/caffeine/unavailable.html, which displays the above message.

Soon?

The statement that caused some panic was of course “Soon we will activate Caffeine more widely, beginning with one data center”. As anyone who has been in online marketing for a while will tell you, major Google updates have historically caused significant fluctuations in traffic while they settle into place. That is not something you want to be experiencing over the holiday period, when online purchases go through the roof (fingers crossed).

Luckily, Google’s Matt Cutts pre-empted this concern and stated:

“Caffeine will go live at one data center so that we can continue to collect data and improve the technology, but I don’t expect Caffeine to go live at additional data centers until after the holidays are over. Most searchers wouldn’t immediately notice any changes with Caffeine, but going slowly not only gives us time to collect feedback and improve, but will also minimize the stress on webmasters during the holidays.”

Full post by Matt. We all breathe a sigh of relief.

So we can expect Caffeine to be rolled out to a single data center shortly, and hopefully the location of that data center will be made public. Google uses many data centers, so the impact of any side effects, negative or positive, is likely to be minimal until late January 2010, when we expect to see further expansion.

I will add the location of the data center (IP address) to the post once it is known.

What to Expect

The current architecture that underpins Google is around 10 years old and is updated in sporadic chunks to improve the performance of various elements and add new functionality. As with most complex systems, there comes a time when it is necessary to completely rework what you have to create an even stronger platform for further growth and development. This is Google Caffeine.

At this stage, Caffeine is not a full-blown algorithm update although there has been a lot of speculation regarding things that may have changed. Caffeine is in fact documented as a complete overhaul of the infrastructure that sits under the hood of Google’s search engine. Google’s Official Blog Post.

With regard to Search, this new infrastructure is designed to speed up the rate at which Google can index the billions upon billions of web pages, files, movies and pictures available to it, while also increasing the number of items indexed and the rate at which results are returned. A larger index allows returned results to be more comprehensive and accurate, while quicker indexing allows results to be kept fresh and up to date.

Of course a larger index is going to have an impact on a large number of sites. If Google suddenly crawls your site and finds 10,000 extra pages that it had previously ignored, things will change. Likewise if Google suddenly finds a number of ignored pages on other sites that link through to you, things will change. What if Google suddenly finds 3 resources that are better than yours? Things will change.

Test Results

We carried out testing at various stages of the 3-month period during which the sandbox was available, to gain an insight into things to come. Two things are worth mentioning: we had to use various methods to get comparable results (US proxying, modifying URLs etc.), and the Caffeine sandbox results are slimmed down (no PPC etc.). Conclusions therefore have to be taken with a pinch of salt until we can get our hands on the real deal.

Index Size
At the start of testing, the Caffeine index was noticeably bigger than the current Google index. Although we never saw results close to the 7-fold increase seen by the guys at Mashable, we did see a noticeable difference in the number of results returned. Over the 3 months, the Caffeine index grew slightly, but the gap between Caffeine and the current Google index actually shrank. The number of results returned by Google.com today for Dog is 345,000,000, a lot closer to the 359,000,000 seen by Mashable on Caffeine.

Dog-Results
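
To put the two Dog figures side by side, here is a minimal Python sketch of the arithmetic. It uses only the two counts quoted above and treats them as exact, which is an assumption, since reported result counts are rough figures. The variable names and the `relative_gap` helper are ours, purely for illustration.

```python
# Result counts quoted in this post for the query "Dog".
# Both are treated as exact here, which is an assumption.
google_today = 345_000_000       # Google.com, as quoted above
caffeine_mashable = 359_000_000  # Caffeine sandbox, as seen by Mashable


def relative_gap(larger: int, smaller: int) -> float:
    """Return how much bigger `larger` is than `smaller`, as a percentage."""
    return (larger - smaller) / smaller * 100


gap = relative_gap(caffeine_mashable, google_today)
print(f"Caffeine figure is {gap:.1f}% larger than the current index figure")
```

On these numbers the difference works out to roughly 4%, which is a long way from a 7-fold increase.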

Result Speed
Like others, we saw an increase in speed. This could be explained by the fact that the result set didn’t feature PPC, sitelinks etc. But as with the index size, the gap seemed to close, which can again be seen from the above example.

Result Accuracy
We didn’t notice any drop in poor-quality results worthy of note (not for want of trying), and in general results were very similar across the board. There was a time when social media sites seemed to be getting a boost for certain queries, but this now seems to be reflected in the main index as well.

Conclusion

We are not expecting the initial launch of Caffeine to have a major impact because, at the close of the public sandbox, results were very similar to current results. That being said, we have learnt never to take anything for granted. What will be more interesting are the subsequent updates that take place on the back of the Caffeine rollout. Real time, trust, links and content are all still very much at the top of Google’s mind, and Caffeine looks set to give them the foundation to delve further into these issues.

Exciting times ahead, but for now keep adding worthwhile content, keep engaging and growing your community, and keep the links coming.

We would be very interested to hear from anyone else who looked at the Caffeine sandbox closer to the end of its availability and look forward to the day when an IP address is confirmed.
