WordPress Duplicate Content

// April 27th 2011 // Rant + SEO + Technology

In February Aaron Bradley sent me an email to let me know that I had a duplicate content problem on this blog. He had just uncovered and rectified this issue on his own blog and was kind enough to give me a heads up.

Comment Pagination

The problem lies in how WordPress handles comment pagination. The default setting creates a comment-page-1 URL that duplicates the post itself.

Here’s what it looks like in the wild. Two URLs serving exactly the same content.

http://blog.wolframalpha.com/2011/04/18/new-age-pyramids-enhance-population-data/comment-page-1/

http://blog.wolframalpha.com/2011/04/18/new-age-pyramids-enhance-population-data

That’s not good. Not good at all.

Comment-Page-1 Problem

The comment-page-1 issue offends my own SEO sensibilities, but how big of a problem is it really?

WordPress Spams Google

A Google inurl search for comment-page-1 returns 28 million results. 28 million!

Do the same inurl search for comment-page-2 and you get about 5 million results. This means that only 5 million of these posts attracted enough comments to create a second paginated comment page. For the rest, comment-page-1 holds every comment and simply mirrors the post. Subtract one from the other and you wind up with roughly 23 million duplicate pages.

The Internet is a huge place so this is probably not a large percentage of total pages but … it’s material in my opinion.

Change Your Discussion Settings

If you’re running a WordPress blog I implore you to do the following.

Go to your WordPress Dashboard, select Settings –> Discussion and uncheck the ‘break comments into pages’ option.

How To Fix Comment-Page-1 Problem

If you regularly get a lot of comments (more than 50 in this default scenario) you might want to investigate SEO-friendly commenting systems like Disqus, IntenseDebate or LiveFyre.

Unchecking the ‘break comments into pages’ setting will ensure you’re not creating duplicate comment pages moving forward. Prior comment-page-1 URLs did redirect, but seemed to be doing so using a 302, which is a temporary redirect (yuck). Not satisfied, I sought out a more permanent solution.
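If you want to see for yourself which kind of redirect your blog returns, you can request a comment-page-1 URL without following the redirect and look at the status code. Here is a minimal sketch using only Python's standard library. The function names and the example.com URL are made up for illustration, so substitute a URL from your own blog.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit


def describe_redirect(status: int) -> str:
    """Label an HTTP status code as a permanent or temporary redirect."""
    if status in (301, 308):
        return "permanent redirect"
    if status in (302, 303, 307):
        return "temporary redirect"
    return "not a redirect"


def fetch_status(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Send one HEAD request to `url` without following redirects.

    Returns the status code and the Location header (None if absent).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


# Example usage (hypothetical URL, needs network access):
# status, location = fetch_status(
#     "http://example.com/2011/04/18/some-post/comment-page-1/")
# print(status, describe_redirect(status), location)
```

A 301 pointing at the post URL is what you want to see. A 302 means the old URL is only being redirected temporarily.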

Implement an .htaccess RewriteRule

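It turns out that this has been a known issue for some time and there’s a nice solution to the comment-page-1 problem in the WordPress Forum courtesy of Douglas Karr. Simply add the following rewrite rule to your .htaccess file. It needs to sit above the standard WordPress rewrite block, since that block’s catch-all rule would otherwise handle the request first.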

RewriteRule ^(.*)/comment-page-1/ $1/ [R=301,L]

This puts 301s in place for any comment-page-1 URL. You could probably use this and keep the ‘break comments into pages’ setting on, which would remove duplicate comment-page-1 URLs but preserve comment-page-2 and above.
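To get a feel for which URLs this rule touches, you can emulate it with Python's `re` module. This is only a sketch. It assumes Python's regex engine behaves like Apache's for a pattern this simple, and that the rule lives in .htaccess, where the path is matched without its leading slash. The `rewrite` helper and the sample paths are made up.

```python
import re

# Pattern and substitution copied from the RewriteRule above.
PATTERN = re.compile(r"^(.*)/comment-page-1/")
SUBSTITUTION = r"\1/"  # Python spelling of Apache's "$1/"


def rewrite(path):
    """Return the redirect target for `path`, or None if the rule does not match.

    When the pattern matches, mod_rewrite replaces the whole path with the
    expanded substitution, so the template is expanded rather than re.sub'd.
    """
    match = PATTERN.search(path)
    return match.expand(SUBSTITUTION) if match else None


samples = [
    "2011/04/18/some-post/comment-page-1/",  # redirected to the post
    "2011/04/18/some-post/comment-page-2/",  # left alone
    "2011/04/18/some-post/",                 # left alone
]
for sample in samples:
    print(sample, "->", rewrite(sample))
```

Only the first sample produces a redirect target, which is why comment-page-2 and above survive with this rule.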

Personally, I’d rather have the comments all on one page or move to a commenting platform. So I turned the ‘break comments into pages’ setting off and went a step further in my rewrite rule.

RewriteRule ^(.*)/comment-page-.* $1/ [R=301,L]

This puts 301s in place for any comment-page-#. The parentheses matter: they capture the post’s path so that $1 has something to refer to. Better safe than sorry.
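One detail worth checking in any rule like this is that `$1` refers to the first parenthesized group in the pattern. The sketch below uses Python's `re` module as a stand-in for Apache's regex engine to compare a pattern with and without that group. As far as I understand mod_rewrite, a backreference to a group that does not exist expands to an empty string. Treat that, the `redirect_target` helper and the sample path as assumptions.

```python
import re


def redirect_target(pattern, path):
    """Emulate `RewriteRule <pattern> $1/`: None if no match, else the target."""
    match = re.search(pattern, path)
    if match is None:
        return None
    # Assumption: a missing group behaves like an empty string, as in mod_rewrite.
    first_group = match.group(1) if match.re.groups >= 1 else ""
    return (first_group or "") + "/"


path = "2011/04/18/some-post/comment-page-3/"  # hypothetical post path

without_group = redirect_target(r"^.*/comment-page-.*", path)
with_group = redirect_target(r"^(.*)/comment-page-.*", path)

print(without_group)  # "/"
print(with_group)     # "2011/04/18/some-post/"
```

Without the parentheses every comment page would be sent to the site root rather than to its post, so the capture group is worth double-checking before you deploy the rule.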

Don’t Rely on rel=canonical

Many of the comment-page-1 URLs have a rel=canonical in place. However, sometimes it is set up improperly.

Improper Rel=Canonical

Here the rel=canonical actually reinforces the duplicate comment-page-1 URL. I’m not sure if this is a problem with the Meta SEO Pack or simple user error in using that plugin.
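A quick way to spot this kind of misconfiguration is to read a page's rel=canonical and see whether it points at a comment-page URL. Here is a minimal sketch using Python's standard `html.parser`. The class name, function name, sample markup and example.com URLs are all made up for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def canonical_points_at_comment_page(html):
    """True if any canonical URL in `html` contains a comment-page segment."""
    finder = CanonicalFinder()
    finder.feed(html)
    return any("/comment-page-" in url for url in finder.canonicals)


# Made-up markup showing the improper and the proper setup.
improper = ('<head><link rel="canonical" '
            'href="http://example.com/some-post/comment-page-1/" /></head>')
proper = '<head><link rel="canonical" href="http://example.com/some-post/" /></head>'

print(canonical_points_at_comment_page(improper))  # True
print(canonical_points_at_comment_page(proper))    # False
```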

Many times the rel=canonical is set up just fine.

Canonical URL from All-In-One SEO Pack

The All in One SEO Pack does have a Canonical URL option. I don’t use that option but I’m guessing it probably addresses this issue. The problem is that rel=canonical is only a hint to search engines, so it doesn’t stick nearly as well as a 301.

Comment-Page-1 in SERP

So even though this post from over three months ago has a rel=canonical, the comment-page-1 URL is still being returned. In fact, there are approximately 110 instances of this on this domain alone.

Comment Page 1 Site Results

Stop Comment-Page-1 Spam

23 million pages and counting. Sure, it would be nice if WordPress would fix this issue, but short of that it’s up to us to stop this. Fix your own blog and tell a friend.

Friends don’t let friends publish duplicate content.

4 trackbacks/pingbacks

  1. Pingback: Express newspaper creates an infinite number of URLs using rel = canonical » Malcolm Coles on May 6, 2011
  2. Pingback: أرشفة التعليقات : زيادة الزوار أم تكرار المحتوى؟ on July 27, 2011
  3. Pingback: Evitar duplicados por comment-page-1 en WordPress on November 20, 2012
  4. Pingback: Eliminare i Commenti… per Posizionarsi Meglio? on October 27, 2015

Comments About WordPress Duplicate Content

// 19 comments so far.

  1. Aaron Bradley // April 28th 2011

    Thanks for taking the time to do what I didn’t, and put a great how-to post together for the benefit of all!

    A curious problem, insofar as one would have thought that a properly thought-through pagination scheme wouldn’t replicate the actual post under a different URL (let alone do so in situations where there weren’t enough comments to require pagination).

    The Karr workaround is a pretty good hack, but maybe WordPress will address this in a future release (or some frustrated developer will write a plugin to take care of this more elegantly).

  2. Cindy T. // June 09th 2011

    Great post, thank you very much. I will give these a try!

  3. Noa Noa // July 02nd 2011

    You’ve explained something that is quite complicated very well. When it comes to duplicate content, sometimes it’s hard to know who to believe. However, this is such a simple change to make that it’s a no-brainer, just in case it does make a difference.

  4. keith // July 11th 2011

    This is an old problem (from 2.7) and you should use standard SEO plugins to address it. Check out Platinum SEO; it does this amazingly well.

  5. Jude // August 11th 2011

    Duplicated content is really annoying when it happens automatically. I was lucky in that I was warned about this before I made the mistake. It is ludicrous that it was built in as a possibility.

    In terms of anti-spam, I find Disqus is by far the easiest to use.

  6. AJ Kohn // August 12th 2011

    Jude,

    It does seem strange that it was built like this and that someone like Google hasn’t tapped WordPress on the shoulder and asked them (not so kindly) to fix it. And another vote for Disqus. I’m finding there is a very big division out there on the use of third-party commenting systems. I guess that shouldn’t surprise me since comments are so valuable.

  7. Alan // November 04th 2011

    Hi,

    Thanks for the post…

    You said: “Unchecking the ‘break comments into pages’ setting will ensure you’re not creating duplicate comment pages moving forward. Prior comment-page-1 URLs did redirect, but seemed to be doing so using a 302 (yuck). Not satisfied I sought out a more permanent solution.”

    I just tried unchecking the break comments on 2 sites & the “/comment-page-1’s, 2’s, etc” all 301 redirected to the canonical by default. No need to add a 301 to the .htaccess from what I can tell…

    I checked using Live HTTP Headers. Perhaps I am missing something? Would love to hear your thoughts. Thanks.

  8. AJ Kohn // November 07th 2011

    Alan,

    When I first investigated this I was told that changing that setting would result in a 301. For whatever reason it did not do that for me so I embarked on the .htaccess solution. But if you’ve validated with Live HTTP Headers that it’s performing a 301 instead of a 302 then I think you’re safe. No need to crack open your .htaccess file.

  9. Alan // November 07th 2011

    Hi AJ,

    I appreciate your reply! Thought I might be missing something…

    From what I can tell…301’s are in place from within WP, so that’s good news.

    Have a good day!

  10. Shabnam Sultan // February 20th 2012

    Hi,

    I am trying to redirect by implementing the .htaccess RewriteRule as per the post, but the 301 redirect of the comment page is not happening.

    Am I wrong somewhere?

  11. Bill Ray // May 23rd 2012

    Duplicate content can be a big issue. I fixed my .htaccess file so it works okay!

  12. WebDSE // October 14th 2012

    I agree with Keith. It is an old problem and it just needs a little attention. Platinum SEO is the best and easiest solution to this problem.

  13. cnxsoft // February 09th 2013

    Would you know how to write those rewrite rules for nginx?

  14. AJ Kohn // March 06th 2013

    Not offhand. Sorry.

  15. Min Min // February 23rd 2013

    Unfortunately, the single post page is the same as the LAST comment page (the newest one), e.g. comment-page-5, and not comment-page-1, if the “LAST” page is displayed by default. So RewriteRule ^(.*)/comment-page-1/ $1/ [R=301,L] is NOT right.

    Also, once you don’t break comments into pages, they’ll redirect to single post pages and the single post pages have a canonical, so a 301 is unnecessary because the comment pages no longer exist.

    But, thanks, I deselected break comments into pages and this fixed everything. I also think it’s good for ranking single post pages when the content is rich, and usually no one will scrape your comments 🙂 so your content can safely stay original.

  16. AJ Kohn // March 06th 2013

    Min Min,

    I’m glad you deselected that option but I don’t understand your issue with the rewrite rule.

    The one I’m using is RewriteRule ^(.*)/comment-page-.* $1/ [R=301,L] so you can type in any comment-page-# and it will redirect to the actual post.

  17. Anuja // November 16th 2013

    You are simply great. Thanks for the great tutorial on comment pagination. Worth reading!

  18. Md Saiful Islam // December 21st 2013

    I don’t understand how to change the title or description for comment-page-1 or other comment pages?

  19. pkygola // May 13th 2018

    Hello,

    If I don’t want to index mydomain.com/page/1/, mydomain.com/page/2/ and so on…. what should I do?

    Do I need to use a robots.txt file or can it be done via rel=canonical?

    But the mydomain.com/page/1/ and series of these pages are not editable, so where can I put the rel=canonical tag?
