The Beginner’s Guide to Duplicate Content
One of the most frequent challenges I come across as a digital marketer is clients who can’t seem to get a good grasp of what duplicate content really is, how to avoid it, and why it matters to them.
In this article, I’m going to dispel a few myths about duplicate content and SEO that are still lingering in a post-Panda world, and give a few tips on how to stay on the right side of Google’s guidelines so that search engines and users love your content.
What is duplicate content?
From the horse’s mouth, a Google Search Console Help Centre article states:
“…substantive blocks of content within or across domains that either completely match other content or are appreciably similar.”
That doesn’t seem so difficult, but what we need to know is how this affects your website.
Some examples of duplicate content include:
Ecommerce product descriptions. Specifically, generic descriptions provided by a supplier and used across multiple sales outlets. For example, this section on the Nespresso website about a coffee machine...
...has been repeated word for word on Amazon India to sell the same product:
Use of the same page in multiple areas of your site. Again, this is usually a problem for ecommerce sites, e.g. you'll see:
which has the same content as:
Multiple service pages on your website which are too similar to each other.
Your site doesn’t handle its www and non-www versions effectively.
You use another website’s content on your own site. Press releases are a good example of content that is written once and distributed multiple times. Another would be sites that syndicate content and publish nothing original.
You own several domains that sell similar product lines to different target audiences – to both consumers and trade for example.
Why should I care about duplicate content on my website?
Let’s dispel the biggest myth that still gets circulated: the Google penalty myth. Here's the truth: there is NO Google penalty for duplicate content.
This was addressed in a Google Q&A in June of last year. You can watch the whole video here.
However: Google MAY prevent some of your content from showing as a search result if your site has duplicate content issues, and as with all content, it will aim to show the most relevant content to the user at the time.
Google will still index those pages. If it can see the same text across several pages and decides they are the same, it will show only the one it deems most relevant to the user’s query.
There is a distinction between content that has been duplicated unintentionally, for example by your CMS generating new URLs, and content that has been deliberately replicated on a large scale and re-published for financial reward or to manipulate rankings.
Google’s Guidelines for quality are clear on this subject. If you use illicit tactics for generating content, or create pages with no original content, you do run the risk of being removed from search engine results pages (SERPs).
In ordinary cases such as those listed above, the worst that will happen is that the affected pages simply won’t be shown in SERPs.
How to check for duplicate content on your site
Several tools will help you identify areas to improve on your own site, such as:
Moz’s crawler tool will help you to identify which pages on your site are duplicates, and of which other pages. It is a paid tool but it does have a 30-day free trial available.
Siteliner will give you a more in-depth analysis of which pages are duplicated, how closely related they are, and which areas of text are replicated. This is useful where large bodies of text are reused but the whole page is not a complete replication.
Copyscape’s plagiarism checker will also check for copies of your pages being used on the wider web.
If you can’t access these tools for any reason but are concerned that duplicate content may be influencing your site, try selecting a snippet of text and searching for it to see if any direct duplications are returned in the results.
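If you’re comfortable with a little code, you can also get a rough feel for how “appreciably similar” two blocks of text are. The Python sketch below splits each text into overlapping five-word sequences and measures how many they share. To be clear, this is my own illustration and not how Google, Moz, Siteliner or Copyscape measure duplication; the function names, the five-word window and the sample sentences are all assumptions made up for the example.

```python
from __future__ import annotations

import re


def shingles(text: str, size: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a: str, text_b: str, size: int = 5) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up product copy, purely for illustration.
supplier_copy = "This compact coffee machine brews espresso in under thirty seconds with one touch"
your_copy = supplier_copy + " and it comes in red"
print(round(similarity(supplier_copy, your_copy), 2))
```

A score close to 1.0 suggests the two texts are near copies, while a score close to 0.0 suggests they share very little wording. Where you draw the line is a judgment call.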
What to do about content duplications?
This really depends on the type of duplication. Some of the techniques I’ll talk about now aren’t really for the beginner. You may need an SEO agency to hold your hand through this part of the process.
The Problem: Generic product descriptions provided by a supplier
The fix: This one is easy to tackle, but can be resource-heavy. The advice is about as simple as it gets: make your content unique, useful, and interesting for your audience. Usually a manufacturer’s description will tell you what the product is, whereas you need to think about why your customer needs it and why they need to buy it from you.
There’s nothing stopping you from using the specification of a product and then adding your own wording around it. Add in your tone of voice and personality. Think about your specific audience and their personas. Think about why they would want to buy your product and then tell them your unique selling proposition. What problem or need does it satisfy that they will relate to?
The Problem: Same page in multiple places on your site
The fix: In this instance, you should include a canonical URL on each duplicated page that points to the original as the preferred version of the page. In my ecommerce example, where a red jacket appears in both the “sale” and “jackets” categories, one of them should include a canonical link in the code of the page to acknowledge the duplication. An example would be as follows:
On the jacket contained in the “Sale” page:
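A canonical link is a link element with rel="canonical" placed in the head of the duplicate page, with its href pointing at the preferred version. As a rough sketch, the Python below builds that element for the jacket example and checks whether a page already declares one. The example.com URLs and the function names are hypothetical; substitute your own addresses.

```python
from __future__ import annotations

from html import escape
from html.parser import HTMLParser


def canonical_tag(preferred_url: str) -> str:
    """Build the <link> element that goes in the <head> of the duplicate page."""
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


class CanonicalFinder(HTMLParser):
    """Collect the href of every rel="canonical" link found in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.canonicals: list[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonicals(html: str) -> list[str]:
    """Return the canonical URLs declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical URL: the "jackets" version is treated as the original.
tag = canonical_tag("https://www.example.com/jackets/red-jacket")
sale_page = "<html><head><title>Red jacket</title>" + tag + "</head><body></body></html>"
print(tag)
print(find_canonicals(sale_page))
```

If find_canonicals returns an empty list for a page you know is a duplicate, that page is probably missing its canonical link.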
The Problem: Service pages on your website which are too similar to each other
The fix: There are a couple of options here. You can try to make the pages sufficiently different. However, if the pages are largely around the same subject with only slight differences, you may be better served using just one page to talk about both subjects. I would advise removing the least valuable page and applying a 301 redirect from it to the most valuable page. One valuable page is certain to be more successful than two weak or conflicting pages.
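In practice a 301 redirect is usually set up in your web server or CMS rather than written by hand, but it helps to see what one actually does. The Python sketch below is a tiny WSGI application that answers a request for a retired page with a 301 status and a Location header pointing at the page you kept. The two paths are hypothetical, and the placeholder response for every other page is only there to keep the example short.

```python
# Hypothetical paths: the weaker service page and the page that replaces it.
REDIRECTS = {
    "/services/seo-audits": "/services/seo",
}


def app(environ, start_response):
    """Minimal WSGI app: 301 for retired pages, a placeholder 200 otherwise."""
    path = environ.get("PATH_INFO", "/")
    # Ignore a trailing slash so /page and /page/ are treated the same.
    target = REDIRECTS.get(path.rstrip("/") or "/")
    if target is not None:
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"page content"]
```

To try it locally you could serve it with wsgiref.simple_server.make_server from the standard library and request the retired path in your browser.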
The Problem: Your site doesn’t handle www. and non-www. versions of your site effectively
The fix: The easiest way to test for this is to take a URL on your site, remove the www. portion in your browser’s address bar, and see what happens when you try to load the page. Ideally a redirect should take place from one to the other.
Note: it doesn’t matter which you go with, just pick one way and be consistent. Also make sure you have identified your preferred version in Google Search Console.
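The rule itself is simple enough to sketch in a few lines of Python: pick the www or non-www form, and redirect any request that arrives on the other one. The function names and example.com addresses below are hypothetical, and the sketch assumes absolute URLs on a site with no other subdomains.

```python
from urllib.parse import urlsplit, urlunsplit


def preferred_url(url: str, use_www: bool = True) -> str:
    """Rewrite url so its host matches the chosen www or non-www form."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host:
        return url  # Not an absolute URL, so leave it untouched.
    bare = host[4:] if host.startswith("www.") else host
    wanted = f"www.{bare}" if use_www else bare
    netloc = wanted if parts.port is None else f"{wanted}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def redirect_for(url: str, use_www: bool = True):
    """Return (301, target) when url is on the wrong host, else None."""
    target = preferred_url(url, use_www)
    return None if target == url else (301, target)


print(redirect_for("http://example.com/sale"))
print(redirect_for("https://www.example.com/"))
```

Whichever form you choose, the important part is that every page answers on one version only and the other version redirects to it.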
The Problem: You use another website’s content openly on your website
The fix: This scenario tends to happen if you use press releases or if you use feeds to populate certain areas of your site, to show the latest events in a specific region, for example.
There’s no real hard and fast rule to this. If you are sure that this type of content provides value to your users you can either accept that you’re never going to rank well for that content (but the rest of your site might) or you can take the time to make the content unique to your audience.
The Problem: Having two websites selling the same goods to different audiences
The fix: This one is somewhat complex. The best way to combat this, from a search point of view, is to combine your online presences into one site. There may be good business reasons for having two separate brands which cater to different audiences. You still need to be aware that they will ultimately be competing for attention in the search engine results pages (SERPs).
Simply adhering to Google’s quality guidelines will help. Create content which is useful, credible, engaging and, wherever possible, unique.
Google does a decent job of spotting unintentional duplications but the tips above should give you an idea of how to get search engines and users to understand your site.
About the author
Jean Frew is a Digital Marketing Consultant at Hallam, specializing in SEO. Jean has worked in Ecommerce and Digital Marketing since 2007 and is experienced in driving online growth, as well as managing budgets and projects of all sizes. She has a broad knowledge of Digital Marketing and utilises analytics to make data-driven decisions.