What You Should Know About Google Canonicalization URL Issues

Canonicalization can be a complex area for webmasters. Internet marketers who design their own websites can also be bewildered when they encounter it for the first time. Let’s take a look at what canonicalization is and how you can use it to solve URL problems.

Canonicalization is the process of standardizing URLs. A domain can often be reached through several URL variations that all lead to the same page, which makes it hard for search engines to determine which version you actually want to use. For instance, Google can treat the following URLs as different pages even though all of them serve the same content.

http://www.yourwebsitename.com
http://yourwebsitename.com
http://www.yourwebsitename.com/index.html
http://yourwebsitename.com/index.html

Here are two reasons why this can happen:

(1) Inconsistent Internal Linking Pattern

This happens when you don’t maintain a consistent internal linking pattern within your own site. If different pages link to your homepage using different URL versions like the ones above, you can run into the canonicalization issue. Always decide beforehand which URL to use and be consistent. Also, use an absolute link instead of a relative link for your homepage. An absolute link uses the full URL, domain name included. If your homepage link is called “Internet Business Home”, hyperlink it directly to http://www.yourwebsitename.com instead of index.html, which is a relative link.
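To check a site for the URL variations listed earlier, it can help to map every link to one chosen form and see which ones differ. The following is a minimal Python sketch, not part of the original setup: the canonical host, the list of index file names and the `canonicalize` helper are all assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed for this sketch: the "www" host is the one you decided to use.
CANONICAL_HOST = "www.yourwebsitename.com"
BARE_HOST = "yourwebsitename.com"
# Assumed index file names that serve the same page as the folder itself.
INDEX_FILES = {"index.html", "index.htm", "index.php"}


def canonicalize(url: str) -> str:
    """Map www/non-www and index-file variants of a URL to one form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Treat the bare domain as the same site as the www host.
    if host == BARE_HOST:
        host = CANONICAL_HOST
    # An empty path and "/" are the same resource.
    path = parts.path or "/"
    # Drop a trailing index file so /index.html becomes /.
    head, _, tail = path.rpartition("/")
    if tail in INDEX_FILES:
        path = head + "/"
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


variants = [
    "http://www.yourwebsitename.com",
    "http://yourwebsitename.com",
    "http://www.yourwebsitename.com/index.html",
    "http://yourwebsitename.com/index.html",
]

# All four variants collapse to a single URL.
print({canonicalize(u) for u in variants})
```

The sketch writes the homepage with a trailing slash, which is equivalent to the URL without it. Any internal link that changes when passed through `canonicalize` is a candidate for cleanup.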

(2) Link Building With Different URLs To Same Domain

Even when your internal linking pattern is consistent, you might still encounter the canonicalization issue, for instance when inbound links to your domain use different URL versions. Google can get confused and treat these different URLs as separate pages, or even separate sites, although they all lead to the same domain. Sometimes this is beyond your control, as you have no say over how others link to you.

Pitfalls of Using Different URLs To Same Domain

Linking inconsistently, whether on-site or off-site, can have a negative impact on search engine rankings, PageRank and link juice distribution. Consider this scenario. If you’re building links to http://www.yourwebsitename.com and also to http://yourwebsitename.com or http://www.yourwebsitename.com/index.html, thinking they are the same thing, you’re wasting part of your effort. You’re in fact diluting the link popularity of your main domain, which is http://www.yourwebsitename.com.

You won’t get maximum PageRank and link juice either. If you get 30 links to each of the three URLs, don’t assume you have 90 links to http://www.yourwebsitename.com. It’s only 30, because the URL variations have split your link building efforts. So don’t expect the ranking boost for http://www.yourwebsitename.com that 90 inbound links would have brought.
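The arithmetic can be made concrete with a small Python sketch. The link list below is made-up data that mirrors the 30-links-per-URL scenario. It only illustrates counting links per distinct URL; it does not model how Google actually weighs links.

```python
from collections import Counter

# Hypothetical inbound links: 30 pointing at each of three URL variants.
inbound = (
    ["http://www.yourwebsitename.com"] * 30
    + ["http://yourwebsitename.com"] * 30
    + ["http://www.yourwebsitename.com/index.html"] * 30
)

# Count links per distinct URL, as if each variant were a separate page.
per_url = Counter(inbound)

print(per_url["http://www.yourwebsitename.com"])  # links credited to the main URL
print(sum(per_url.values()))                      # links you actually built
```

The first number is 30 and the second is 90. The gap between them is the effort that was spread across variants instead of going to the main URL.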

Another downside is losing PageRank, partially or totally, on your main domain. Speaking from experience, this is what I encountered. I had a domain with PageRank 3 and, out of nowhere, it dropped to PageRank 0. I knew it still had solid backlinks and I hadn’t done anything to get it penalized. I was perplexed.

How Did I Know I was Affected?

The first thing I did was check whether my domain was still indexed. After a check on Google with the site command “site:www.yourwebsitename.com”, I found my domain still listed, but without “www” in front: just yourwebsitename.com.

I found that weird because it had always been indexed with “www”. I checked my other indexed pages and all of them had “www”. I researched the issue thoroughly and found that I had a canonical URL problem. My internal linking was consistent and I hadn’t built inbound links using different URL versions, so the cause had to be something else. I suspect a few links that I didn’t build used a different URL version, and Google mistook http://yourwebsitename.com for my main domain instead of http://www.yourwebsitename.com, which resulted in my PageRank dropping.

What Did I Do To Solve The Problem?

I think that if I hadn’t found a solution in time, http://yourwebsitename.com would have received the PR3 instead in the next update. Fortunately, I found a quick solution after some research. This is what I did.

I set up a 301 Redirect from “non-www” to “www” to tell Google that http://yourwebsitename.com and http://www.yourwebsitename.com are the same website. A 301 is a permanent redirect, so every request for the non-www version is sent to the www version and only one URL remains. Here is what the 301 Redirect code looks like.

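============================================

# Needed for mod_rewrite rules to work in .htaccess
Options +FollowSymLinks

# Turn on the rewrite engine
RewriteEngine On

# Match requests whose host is the non-www domain (NC = case-insensitive)
RewriteCond %{HTTP_HOST} ^yourwebsitename\.com$ [NC]

# Send a permanent (301) redirect to the same path on the www domain (L = last rule)
RewriteRule ^(.*)$ http://www.yourwebsitename.com/$1 [R=301,L]

============================================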

You need to put this code in the .htaccess file on your server so that the server answers requests for the non-www version with the 301 Redirect. This works on an Apache server with mod_rewrite enabled. Your .htaccess file is located in the same folder as your index file. Download it and open it in a text editor like Notepad or Metapad, for example by dragging and dropping it into the editor window. Next, add the 301 Redirect code and save. Then upload .htaccess back to your server. That’s what I did and, after some time, my domain was indexed again, this time with “www”. I also regained my initial PageRank.
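Before uploading, it can be useful to reason about which requests the rule will redirect and where. The following Python sketch imitates the RewriteCond and RewriteRule lines with the same regular expressions. It is a simplified model for illustration, not a replacement for testing on the real server: the `redirect_target` helper is hypothetical, and query strings and ports are not modeled.

```python
import re
from typing import Optional

# Same pattern as the RewriteCond line; re.IGNORECASE plays the role of [NC].
HOST_PATTERN = re.compile(r"^yourwebsitename\.com$", re.IGNORECASE)
# Same pattern as the RewriteRule line.
PATH_PATTERN = re.compile(r"^(.*)$")


def redirect_target(host: str, path: str) -> Optional[str]:
    """Return the 301 target for a request, or None if the rule does not fire.

    `path` is the request path without its leading slash, which is how
    mod_rewrite presents it to rules placed in a .htaccess file.
    """
    if not HOST_PATTERN.match(host):
        return None  # e.g. the www host: the condition fails, no redirect
    match = PATH_PATTERN.match(path)
    if match is None:
        return None
    # $1 in the rule is the first captured group.
    return "http://www.yourwebsitename.com/" + match.group(1)


print(redirect_target("yourwebsitename.com", "index.html"))
print(redirect_target("www.yourwebsitename.com", "index.html"))
```

The first call returns the www URL for the same page. The second returns `None` because requests that already use “www” are left alone, which is what keeps the rule from redirecting in a loop.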

URL canonicalization is a pretty technical issue, and many people can get confused about what’s happening to their website or blog. I hope this article has shed some light on the subject.

This is a guest post by Jean Lam.

21 comments

Free Picks April 17, 2011 - 3:50 pm
thanks for the good post, but as much as i know adding "/" sign at the end of url is good for SEO
John | Affiliate Marketing April 17, 2011 - 10:40 pm
Very interesting info, I had no clue about this. I also have done link building the right way according to this post but it's always nice to see what I'm doing is the right way to do it!
Jean April 18, 2011 - 1:08 am
Thank you for your comments. Glad it helps.
Anthony April 18, 2011 - 4:20 am
Building links on the same url pattern will help a site boost its ranking online. This is the reason why some bloggers build links to their pages instead of just the home page. It lets the page to get a better position on Google search ranking.
Free Landing Page April 18, 2011 - 3:33 pm
Great post for SEO! Nice solution to solve the problem with the site not getting indexed from Google.
Software Development Company April 18, 2011 - 6:21 am
Good explain about Canonical URL issues. 301 redirect is best option for that.
Ikranic April 18, 2011 - 2:44 pm
Thanks Jean Lam. I think You helped me a lot by publishing this site.
Alex April 18, 2011 - 4:24 pm
Not many people know that a www and a non-www version of the same page exist and they unwittingly create duplicate content and may incur a small duplicate content penalty. You could also use canonical tag and set a default url for all your pages, but the redirect is the best way to go.
Jean April 21, 2011 - 6:29 am
That's so true, sometimes many people are unaware of this. This seemingly "minor" issue can cause problems. Thanks for your valuable input.
Ionel Roman April 19, 2011 - 6:36 am
Now I know why some of my websites dropped in PR, although I did nothing "wrong" :D Thanks for this great info mate! I'll go into the htaccess file right away and do the necessary changes Keep'em coming
Jean April 21, 2011 - 6:35 am
Thanks for your valuable input :). Sometimes if you experience a drop in PageRank say from PR 4 TO PR 2, it might not necessarily be a problem with canonical URL. One other reason can be you lost some high PageRank backlinks, hence naturally your page will lose PR. But it's always a safe precaution to do a 301 Redirect to prevent any potential future problems with canonical URL. That sudden drop in total PageRank was weird for me and I know it wasn't about backlinks.
Techabouts April 21, 2011 - 1:03 am
Thanks a lot specially for 301 direct.I never knew that problem before.
AJ Clarke April 21, 2011 - 1:03 am
Oh yea, Canonicalization can really mess up your page rank and I've had lots of clients looking for SEO help and when i approached their site I saw some huge issues in terms of their URL structure. In my opinion the best ways to keep this in line is the following... 1. Using the rel="canonical" tag in your <head> (most important) 2. .HTACCESS for redirections like you mentioned. 3. Having a good breadcrumbs navigation across your whole site. Of course there are more, but these will help a lot, pretty much just the rel="canonical" can solve lots of issues. That's why Google added it - Ask matt cutts if you don't believe me :)
Jean April 21, 2011 - 6:45 am
Thanks for your valuable input. Yeah there are several ways for sure to tackle the canonicalization issue. Oh what a word, difficult to write and spell :). Some other solutions include submitting a proper sitemap with all your canonical URLs telling Google these are your authoritative urls OR using Google Webmaster Tools to specify which url version you want ie "www" or "non-www". I think "Best Ways" is subjective. Not all people have the same exact problem and experience. It can be a different issue for a different site for instance. There are several intricacies involving canonical URLs. There are certain cases when using a permanent 301 Redirect is advised over the rel="canonical" and vice versa. According to Matt Cutts, there is no big difference as to their efficiency and effectiveness. Both can be used to solve this issue generally. But some differences can include: (1) rel="canonical" is restricted within the same domain whereas 301 Redirect can be used to cross domain as well. (2) rel="canonical" is client-side while 301 Redirect is server-side. (3) rel="canonical" code looks simpler than the 301 Redirect so it can be more appropriate for untechnical people. As a general rule of thumb, I think the ultimate solution to canonical URL issues would be to take as many precautions to prevent it from happening in the first case :) like using consistent URL navigation structure(breadcrumbs navigation like you mentioned), using absolute URL for homepage link and always ensure that you allow others to link to your site using your canonical url. But for sure, sometimes you can't control how people link to you so in that case, need to take appropriate measures. At least you've done your homework on your part.
AJ Clarke April 21, 2011 - 12:54 pm
You're right. Preventative care is always best. Whenever you face a site that has canonical errors it can be a lot of work to get it back on track and it may take a lot of time for Search Engines to re-index your site correctly. Well, you can't have a consistent rel="canonical" without consistent URL formatting. So by focusing on your rel value it will force you to take a look at your URL's, especially in a dynamic setting such as WordPress. If you can get people to link to your canonical URL that is great, but otherwise I don't think it's too much of an issue as long as your redirects are in place.
Jean April 21, 2011 - 2:45 pm
Yup that's true. That's why apart doing preventative care, it's wise to do the redirects as well beforehand even though everything is fine for now. Thank you for your valuable comments.
Jean April 22, 2011 - 7:48 am
Thanks for your input Kavya :).
AJ Clarke April 24, 2011 - 3:49 pm
Neosys, Does shameless self promotion on blog comments help? and no...adding your site to directories doesn't help with canonicalization. Actually, it's pretty much pointless unless it's dmoz.
Jerrick April 27, 2011 - 10:51 pm
i get it but i need do backlink with those other URL for the same page for me to track the traffic. Now i know it will harm to SEO. Sometime a URL which too long will affect the loading time as well for the page and sometime even worst that become page not found. URL do important as well .
Harsh Agrawal May 1, 2011 - 1:55 pm
Nice insight Jean.. Good thing is Wordpress takes care of Canonicalization by default..but if you are working on a static site or any platform, this is something which one should be taking care of..Else accidently they created a replica of your site and thus duplicate penalty...
diet reviews August 20, 2011 - 9:39 am
But checking your page source you have a number of things using absolute addressing to the www variant so it may be less painful to chose that one.