Too Many URLs Spoil the SEO: Fixing a Common Ecommerce Duplicate Content Problem
A common SEO problem for ecommerce sites is a CMS (content management system) that creates a different URL for each category a product lives under. This is bad for SEO mainly because search engines allocate only so much bandwidth to crawling your site. If most or all of your product pages have duplicates, your site is less likely to be fully crawled and indexed, which means lost organic search opportunity.
The above shows that 6 copies of the product page for Abercrombie's Clarissa skirt are currently indexed in Google. Half of the links lead to a 404 page; the rest look like this:
The best practice is to give each product page a single global URL alias that the UI looks up and renders, instead of a category-specific URL. If you wish to maintain breadcrumb trails, use a session ID or cookie to track which categories the customer clicked through to reach the product. If the visitor lands on the page without browsing through a category menu (search engine referral, affiliate link, PPC ad, email, site search, etc.), default to a parent category.
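As a rough illustration of that approach, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than any particular platform's API: the `Product` fields, the `/products/<slug>` URL pattern, and the `last_category` session key are all hypothetical. The point is only that the URL never depends on the category, while the breadcrumb is derived from the visitor's session.

```python
from dataclasses import dataclass


@dataclass
class Product:
    slug: str               # used in the single global URL (hypothetical field)
    categories: list[str]   # every category path the product is listed under
    default_category: str   # parent category used when there is no browsing history


def canonical_url(product: Product) -> str:
    """One URL per product, regardless of which category the visitor came from."""
    return f"/products/{product.slug}"


def breadcrumb(product: Product, session: dict) -> list[str]:
    """Build the breadcrumb trail from the session, not from the URL."""
    # "last_category" is an assumed session key set when a category page is viewed.
    last_category = session.get("last_category")
    if last_category in product.categories:
        trail = last_category
    else:
        # Direct landings (search, email, PPC, etc.) fall back to the parent category.
        trail = product.default_category
    return trail.split("/") + [product.slug]
```

A visitor who browsed a "new arrivals" category would see that trail, while a visitor arriving from a search engine would see the default parent category. Both would be on the same URL.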
Keep in mind that session IDs can get crawled and indexed by search engines, creating even worse duplicate content issues (and security issues with some ecommerce platforms). To block search engines from crawling URLs with session IDs, use this syntax in your robots.txt file:
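As a sketch, assuming the session ID travels in a query parameter named `sessionid` (a hypothetical name; substitute whatever your platform actually uses), the rules might look like this:

```
User-agent: *
# Block any URL carrying the (hypothetical) sessionid query parameter,
# whether it is the first parameter or a later one
Disallow: /*?sessionid=
Disallow: /*&sessionid=
```

The `*` wildcard is honored by major crawlers such as Googlebot and Bingbot, but not necessarily by every crawler, so test the rules against your own URLs before relying on them.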
Reducing duplicate content has other benefits too: it prevents PageRank dilution and may simplify your web analytics, since product page views and conversions aren't spread over multiple URLs.