Tuesday, May 5, 2009

Duplicate Information/Articles with Google's new Algorithm

In years past, most of the duplicate "information" on websites that hurt your S.E.O. lay in the titles. Previous Google crawlers, bots, spiders, whatever you want to call them, only really "had time" to check titles, which is why 301 and 302 redirects, and now the even more effective canonical tag, became the usual fixes. However, I recently experimented with this and figured out that titles are not the only thing being crawled for duplicate information; content as a whole is checked too.
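For anyone who wants to see whether a page already declares a canonical URL, here is a minimal sketch using only Python's standard library. The sample HTML, the example.com address, and the names CanonicalFinder and find_canonical are made up for illustration; this only reads the tag out of a page and says nothing about how Google itself treats it.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # rel can hold several space-separated values, so split it first.
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attrs.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page used only for this example.
page = """<html><head>
<title>My Post</title>
<link rel="canonical" href="http://example.com/my-post">
</head><body>Hello</body></html>"""

print(find_canonical(page))
```

If two copies of a post both point their canonical tag at the same URL, that is the one you are asking the search engine to treat as the original.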

It only makes sense that the Google crawlers of 2000 would not be nearly as advanced as the ones we have in 2009, and that content as a whole, not just titles or links, is now being checked for spam. I ran an experiment on this for one of my clients using a Blogspot blog and a WordPress blog. I submitted the Blogspot blog to multiple blog directories and social bookmarking sites, and the WordPress blog to maybe only 3 sites. I had made about 12-13 posts on the Blogspot blog and copied each and every post to the WordPress blog exactly as it was, titles included, which means the two blogs have similar URLs.

One day I checked Google and the Blogspot blog was on the front page along with yelp.com. When I checked again later that day, both yelp.com and the Blogspot blog had been pushed back to page 4. The WordPress blog was on page 2 or 3; it fluctuated, but it was definitely above the Blogspot blog. At that point I compared the two blogs using

http://www.webconfs.com/similar-page-checker.php

They were roughly 79% similar. I went in and changed articles, keywords, tags, and titles, down to a point of 19% similarity.
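If you want a rough similarity number without relying on an online tool, something like the sketch below can work. It is only an approximation: it compares the word sequences of two texts with Python's built-in difflib, which is an assumption on my part and not necessarily how the webconfs checker (or Google) measures similarity, so the percentages will not match that tool exactly. The sample sentences and the function names are made up for the example.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def similarity_percent(text_a, text_b):
    """Return a rough 0-100 similarity score based on shared word sequences."""
    matcher = SequenceMatcher(None, words(text_a), words(text_b), autojunk=False)
    return round(matcher.ratio() * 100)


# Made-up sample text standing in for two versions of the same post.
original = "Best pizza in town, fresh dough made daily"
copied = "Best pizza in town, fresh dough made daily"
rewritten = "Our kitchen bakes every pie to order with dough prepared each morning"

print(similarity_percent(original, copied))     # identical text scores 100
print(similarity_percent(original, rewritten))  # a real rewrite scores much lower
```

Paste the visible text of each page into the two arguments and rewrite until the score drops to a level you are comfortable with.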

Roughly a day or 2 later, the Blogspot blog was not just on page one but in a decent spot on the front page. The WordPress blog, however, is seemingly nonexistent in Google searches. Now this is just one case, and definitely something I am going to continue experimenting with, but be careful with article submissions and profile submissions. I am currently changing the text for almost everything so nothing is really "identical". With Google's new algorithm, original content is becoming more and more crucial.