Getting a website to rank well means meeting a set of criteria. From content to internal linking to the various tags, on-page optimization follows a few rules. One of the main SEO criteria is the quality of your content. Google favors sites built on high-quality, rich and unique content. Duplicate or near-duplicate content can therefore seriously hurt your rankings.

What is duplicate content?

Duplicate content is content that appears in more than one place, either on several pages of the same website or across different websites. It causes problems because crawlers struggle to tell which version is the original and which is the copy. That makes it hard for them to return the right page for a user's query.

Google increasingly pushes websites to offer a better user experience, and duplicate content runs against that trend. This is why Google will not display every duplicate version and has to choose the one it considers most relevant. The result is a loss of perceived quality and, in turn, of traffic. Not all types of duplicate content are equally harmful, but they can lead to three kinds of problems:

  • Diluted link metrics (trust, anchor text, link juice, authority), because it is hard to direct them to the right page or the right version;
  • Inability to rank the correct version of the content for a given query;
  • Confusion over which version is the original and should be indexed.

What types of duplicate content should I avoid?

Certain types of duplicate content will have a more negative impact than others on your SEO.

Similar URLs

Google treats www, non-www, .com, .com/index, http and https URLs as different addresses even when they point to the same page, so the page is perceived as duplicate content. Likewise, click-tracking parameters or analytics codes appended to a URL can generate duplicates.
Example:

www.monsiteweb.com/blue-shirt?color=blue
www.monsiteweb.com/blue-shirt
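
To make the idea concrete, here is a minimal Python sketch that reduces URL variants to a single comparable form, so that you can spot addresses likely to serve the same page. The function name `normalize_url`, the default list of ignored parameters and the choice of forcing `https` are assumptions made for illustration. They do not describe how Google itself compares URLs.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: parameters that never change the page content on your site.
DEFAULT_IGNORED = frozenset({"utm_source", "utm_medium", "utm_campaign", "sessid"})

def normalize_url(url, ignored_params=DEFAULT_IGNORED):
    """Return a simplified form of `url`, for duplicate comparison only."""
    if "://" not in url:
        url = "https://" + url  # accept bare "www.example.com/page" input
    parts = urlsplit(url)

    # Treat www and non-www hosts as the same site.
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]

    # Treat /index, /index.html and /index.php as the directory itself.
    path = parts.path
    for index_name in ("/index.html", "/index.php", "/index"):
        if path.endswith(index_name):
            path = path[: -len(index_name)] + "/"
            break
    path = path.rstrip("/") or "/"

    # Drop ignored parameters and sort the rest so their order does not matter.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored_params
    )

    # The scheme is forced to https so http and https variants compare equal.
    return urlunsplit(("https", host, path, urlencode(query), ""))
```

For the example above, `normalize_url("www.monsiteweb.com/blue-shirt?color=blue", ignored_params={"color"})` and `normalize_url("www.monsiteweb.com/blue-shirt")` return the same string, which flags the two addresses as candidates for duplicate content. Whether `color` can really be ignored depends on your own site.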

Product pages

Anyone who runs an e-commerce site knows this situation. If you reuse the product descriptions sent by your supplier, your pages may suffer. You are almost certainly not the only retailer selling these products, so the same description will appear on many websites, and Google will treat those pages as duplicate content.

Information from third-party sites

If you include content from another site, such as an excerpt, a quote or a comment, it may be perceived as duplicate even if you link to the source. Google gives little credit to this type of content, and it may count against the perceived quality of your page.

Printable versions

If your pages have printer-friendly versions, they can cause duplicate content issues when several versions of the same page are indexed.
Example:
www.unsiteweb.com/blue-shirt
www.unsiteweb.com/print/blue-shirt

Session IDs

If each visitor is assigned a different session ID that is kept in the URL, every visit produces a new URL for the same page, which causes duplicate content issues.
Example:
www.unsiteweb.com/blue-shirt?SESSID=320
www.unsiteweb.com/blue-shirt

Filters and categories

Many e-commerce sites use filters and categories to organize user searches, and each combination generates its own URL. The problem is that even though these URLs differ and sort the content differently, the content itself stays the same, which creates duplicate content.
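
One way to check whether filtered or sorted listings carry the same content is to compute a fingerprint that ignores the order of the items. The Python sketch below is only an illustration under simple assumptions: each page has already been reduced to plain text with one item per line, and the names `fingerprint` and `group_duplicates` are hypothetical helpers, not part of any SEO tool.

```python
import hashlib
from collections import defaultdict

def fingerprint(text):
    """Hash the set of non-empty lines, ignoring order, case and extra spaces."""
    lines = sorted(
        {" ".join(line.lower().split()) for line in text.splitlines() if line.strip()}
    )
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()

def group_duplicates(pages):
    """Take a dict {url: page text} and return the groups of URLs with equal content."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    # Keep only fingerprints shared by at least two URLs.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]

# Hypothetical listing pages: same products, different sort order.
pages = {
    "/shirts?sort=price": "Blue shirt\nRed shirt",
    "/shirts?sort=name": "Red shirt\nBlue shirt",
    "/about": "About us",
}
duplicate_groups = group_duplicates(pages)
```

Here the two `/shirts` URLs end up in the same group because they list the same items in a different order, while `/about` stands alone.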

Detection tools

A few tools can help you detect duplicate content. Here are three free ones:

Siteliner

This tool can detect content errors, including duplicate content.

OnCrawl

OnCrawl runs a full audit of your website's SEO performance. In particular, it detects duplicate and near-duplicate content, page clusters and your keyword breakdown.

Copyscape

This tool detects duplicate content outside of your website.
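
To give a feel for what "near-duplicate" means, here is a short Python sketch of one common technique: splitting each text into overlapping word sequences (shingles) and measuring how many they share (Jaccard similarity). It is an illustration only. It makes no claim about how Siteliner, OnCrawl or Copyscape work internally, and the function names and the shingle size of 3 are arbitrary choices.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of length `size`."""
    words = text.lower().split()
    if len(words) < size:
        # Text shorter than one shingle: use the whole text, or nothing if empty.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard_similarity(text_a, text_b, size=3):
    """Return a score from 0.0 (nothing shared) to 1.0 (same shingles)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)

# Two hypothetical product descriptions that differ by a single word.
score = jaccard_similarity(
    "the blue cotton shirt is soft",
    "the blue cotton shirt is warm",
)
```

The two descriptions share three of their five distinct shingles, so `score` is 0.6. The threshold above which two pages count as near-duplicates is a judgment call that depends on your content.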

How to avoid duplicate content?

There are a few ways to deal with this type of content. In many cases, the different URLs need to be canonicalized through 301 redirects, the canonical tag, or the settings in Google Webmaster Central (now Google Search Console).

301 redirect

The 301 redirect helps search engines determine which version to index: you choose the original version and point the duplicate versions to it. Likewise, when several pages that could each rank well are redirected to a single page, they stop competing with one another and create a stronger popularity signal.
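
As an illustration, the following Python sketch shows what a 301 redirect looks like at the HTTP level, using a small WSGI application. The preferred host `www.unsiteweb.com` and the name `app` are assumptions for the example. In practice, redirects are usually configured in the web server or the CMS rather than written by hand.

```python
# Assumption: the version of the site you want search engines to index.
CANONICAL_HOST = "www.unsiteweb.com"

def app(environ, start_response):
    """Minimal WSGI app: send every other host to the canonical one with a 301."""
    host = environ.get("HTTP_HOST", "")
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")

    if host != CANONICAL_HOST:
        location = f"https://{CANONICAL_HOST}{path}"
        if query:
            location += "?" + query
        # 301 tells browsers and crawlers that the move is permanent.
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]

    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"page content"]
```

To try it locally, the standard library's `wsgiref.simple_server.make_server("", 8000, app)` can serve this application. A request for `/blue-shirt` on the host `unsiteweb.com` then receives a `Location` header pointing to `https://www.unsiteweb.com/blue-shirt`.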

The canonical tag

Similar in spirit to the 301 redirect, the canonical tag is easier to set up and can also be used to handle content duplicated on other sites. It tells search engines that the page carrying the tag is a copy and which URL is the original. The link juice is then attributed to the original.
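
The tag itself is a single `<link>` element placed in the `<head>` of the duplicate page. Here is a minimal Python sketch that builds it. The helper name `canonical_link_tag` is hypothetical, and the URL is the one used in the printable-version example.

```python
from html import escape

def canonical_link_tag(canonical_url):
    """Build the <link rel="canonical"> element pointing to the original URL."""
    # escape() protects characters such as & and " inside the attribute value.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'

# The printable page /print/blue-shirt would declare the standard page as original.
tag = canonical_link_tag("https://www.unsiteweb.com/blue-shirt")
```

With this input, `tag` contains `<link rel="canonical" href="https://www.unsiteweb.com/blue-shirt">`.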

The noindex, nofollow tag

This meta robots tag tells search engines not to index a duplicate page, so that it no longer weighs on your SEO.
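
As a sketch, the tag is a `<meta name="robots">` element in the page's `<head>`. The Python helper below, whose name `robots_meta_tag` is made up for this example, builds it from two flags.

```python
def robots_meta_tag(index=True, follow=True):
    """Build a meta robots tag from two flags (both True means no restriction)."""
    directives = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    content = ", ".join(directives)
    return f'<meta name="robots" content="{content}">'

# Tag for a duplicate page that should be neither indexed nor followed.
duplicate_page_tag = robots_meta_tag(index=False, follow=False)
```

Here `duplicate_page_tag` is `<meta name="robots" content="noindex, nofollow">`. Passing `follow=True` instead keeps the page out of the index while still letting crawlers follow its links, which is a choice to make page by page.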

The default domain

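This very simple setting lets you assign a default domain to your site, so that search results always show the same form of URL, for example the www version. Google has since removed this setting from Search Console, so redirects and canonical tags are now the way to express that preference.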

Unique descriptions

For e-commerce sites, try as much as possible to write unique product descriptions. This takes time, but in the long run it will help you rank better than competitors who kept the supplier's default descriptions.