Canonical URLs


Duplicate content across multiple pages hurts your search engine relevancy. But what do you do when the same page can be reached through multiple URLs? Perhaps your page accepts query string variables, or its path works both with and without a trailing slash. In concrete5, a page can also be reached by its unique ID. Fortunately, Google and Yahoo have come up with a great solution to this problem: the canonical link tag.

Implementing canonical URLs on your concrete5 website is pretty straightforward. Just add the following snippet to your theme's header include:

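<?php
// Build the canonical URL from the site base URL and the current page's path.
$canonicalURL = BASE_URL . $c->getCollectionPath();
// Query-string variables that identify distinct content and belong in the canonical URL.
$pageIdentifierVars = array('keywords', 'fID', 'tag', 'productID');
$canonicalVars = array();
foreach ($pageIdentifierVars as $var) {
    // Skip missing, empty, or array values; URL-encode whatever is kept.
    if (!empty($_REQUEST[$var]) && is_scalar($_REQUEST[$var])) {
        $canonicalVars[] = $var . '=' . urlencode($_REQUEST[$var]);
    }
}
// Join the pairs with '&' so the result is a valid query string.
if (count($canonicalVars)) {
    $canonicalURL .= '?' . implode('&', $canonicalVars);
}
?>
<link rel="canonical" href="<?php echo htmlspecialchars($canonicalURL, ENT_QUOTES, 'UTF-8'); ?>" />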

Note that in the above example I'm also adding a few key query-string variables to the canonical URL if they're present. This is because I want the search engines to treat each URL with a unique combination of these parameters as a separate page.
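
To make the filtering of query-string variables easier to see, here is a minimal sketch of the same idea as a standalone function. The name buildCanonicalUrl and its parameters are hypothetical and not part of concrete5, and the example.com address is a placeholder. Unlike the inline snippet, this sketch normalizes slashes between the base URL and the path, so treat it as an illustration rather than a drop-in replacement.

```php
<?php
// Hypothetical helper (not part of concrete5): builds a canonical URL from
// a base URL, a page path, and a whitelist of query-string variables.
function buildCanonicalUrl($baseUrl, $path, array $request, array $allowedVars)
{
    $params = array();
    foreach ($allowedVars as $var) {
        // Keep only whitelisted, non-empty scalar values.
        if (isset($request[$var]) && is_scalar($request[$var]) && $request[$var] !== '') {
            $params[$var] = (string) $request[$var];
        }
    }
    // Normalize slashes so the base URL and path join with exactly one '/'.
    $url = rtrim($baseUrl, '/') . '/' . trim($path, '/');
    if (count($params)) {
        // http_build_query URL-encodes the values and joins pairs with '&'.
        $url .= '?' . http_build_query($params, '', '&');
    }
    return $url;
}

$allowed = array('keywords', 'fID', 'tag', 'productID');
// The tracking parameter is dropped; the whitelisted tag parameter is kept.
echo buildCanonicalUrl(
    'http://www.example.com',
    '/blog',
    array('tag' => 'seo', 'utm_source' => 'newsletter'),
    $allowed
), "\n";
```

In a theme header this could presumably be called with BASE_URL, $c->getCollectionPath() and $_GET, with the result passed through htmlspecialchars() before it is printed into the href attribute.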

Comments:

By sean
Thanks Tony! I'm looking forward to getting this to work now that I've got my Pretty URLs up and running. Where do I find my "theme's header" and whereabouts would you recommend putting this code inside of it? Cheers!
By Tony Trupp
Have a look within /themes/your_theme_name/ or /packages/your_theme_name/themes/your_theme_name/. Not every theme is guaranteed to have a header include, but it's good practice to use one so that header code isn't duplicated in each page type. Anywhere before the closing </head> tag is fine.
By sean
Thanks! I found themes/default/elements/header.php. But now I have a few questions related to this and Pretty URLs duplicating my content.

1) From what I understand, I should also do a 301 Redirect from the duplicate (index.php) URLs to the Pretty (non-index.php) URLs... which will then pass through the mod_rewrite in the .htaccess to eventually read from a path equal to the duplicate URLs. I've looked into the Apache mod_rewrite guide, but anything I try causes an error message saying there's too many redirects going on... I can't help but agree with it. Any suggestions on how to go about this properly?

2) Is this in some way already being achieved by canonizing the Pretty URLs as you've done, as far as crawlers are concerned? And do I need to edit the above code to specifically target the "index.php" in the duplicate URLs?

3) Can I do a PHP canonical redirect with this same header include? Bringing the "non-www" version to the "www" version of my site. Or should that be done in the .htaccess file by mod_rewrite as well? I seem to be able to get away with that without causing "too many redirects," but just thought I would ask you for your opinion.

4) Are the "query-string variables" you've included recommended for everybody or is it just to demonstrate that this is possible? What else in the tag should be omitted if they are unnecessary?

5) I know I've already asked you 4 rather complex questions... there was even like a half question or two stuck in there... but could you please speak to what is being stated in your rel="canonical" link tag by "%3C?=$canonicalURL%20?%3E"
By Tony Trupp
If you add this canonical tag, then the redirect isn't necessary to stop search engines from treating the pages as duplicates. You can still do the redirect if you want, but if I were you I'd do it within /config/site_process.php instead of with Apache. There should be some tutorials in the concrete5.org docs for this kind of thing. No, you don't need to do anything different for each URL variation. The query strings are just to demonstrate what's possible.
By sean
Thanks again for your help - and the incredibly fast responses. Greatly appreciated.
By Tony Trupp
Looks like you're not closing that link tag. Also, the HTML encoding on this post got messed up before; it should be corrected now above.
By Ian
Thank you for posting this solution. Worked like a charm, and google loves it.
By Jay
Hi, when upgrading Concrete5, such as from 5.5.1 to 5.5.2, does the upgrade remove this snippet? I had added it to my header, but after the upgrade I don't see it anymore. Makes me wonder how much other code I lost???
By Tony Trupp
You shouldn't lose customizations made within your header include while upgrading to a new version of concrete5, as long as you didn't make the changes anywhere within the /concrete/ or /updates/ folders.
By Tim
I threw this snippet of code in all of my C5 websites about 2 years ago. All you have to do is dump it into the elements/header_required.php and you will be good to go. I haven't had a single error with Google finding duplicate pages since that time.