When is an error not really an error?
The other day Google Webmaster Tools informed one of my clients that it had found over 1,000 404 errors. Numerous Google folks (including Maile Ohye) have told me that an excess of 404s will adversely impact SEO.
In support of this thesis, and to help track down these renegade links, Google has relatively new functionality that tells you which pages a 404 was linked from. Thank you Google.
Very quickly I realized that all of the 'linked from' pages were cached pages. Pages with a discovery date of between four and six months ago. Internal pages. Pages that have since changed. In fact, pages that no longer have a link to the dead page.
No thank you Google.
Clearly it would have been nice if this client had 301 redirected all of these URLs. But when doing a major architecture change you're often going to orphan a number of URLs. It happens. And if you've retired the links internally, and no external links existed, the pages essentially disappear.
Unless you're crawling an out-of-date copy of the page.
Of course you can request a URL removal via Google Webmaster Tools. But am I really going to do this for 1,000 pages? It's painful even if I can narrow it down using a directory or subdirectory.
Instead I can implement 301 redirects for the offending URLs. All for the sole purpose of ensuring that a cached crawl of internal pages doesn't trip a 404.
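For what it's worth, the redirect option might look something like the sketch below. It assumes a Python site served over WSGI, which is purely an assumption for illustration; the `REDIRECTS` map, its paths, and the `redirect_middleware` name are all hypothetical, and on most sites this would live in the web server configuration instead.

```python
# Hypothetical map of orphaned paths to their replacements.
REDIRECTS = {
    "/old-category/widget": "/products/widget",
}


def redirect_middleware(app, redirects):
    """Wrap a WSGI app so retired paths answer with a 301 instead of a 404."""

    def wrapped(environ, start_response):
        target = redirects.get(environ.get("PATH_INFO", ""))
        if target is None:
            # Not a retired URL: hand the request to the real application.
            return app(environ, start_response)
        start_response(
            "301 Moved Permanently",
            [("Location", target), ("Content-Length", "0")],
        )
        return [b""]

    return wrapped
```

The code is trivial. The chore is building and maintaining the map for 1,000 URLs that nothing links to anymore.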
Both of these options seem unnecessary.
If Google finds a 404 in a cached page why wouldn't they seek out the original to verify that the problem currently exists? It seems like an easy business rule to implement and would likely reduce the volume of URL removal requests.
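To make the idea concrete, here is a minimal sketch of the rule I have in mind, written in Python with only the standard library. It says nothing about how Google's crawler actually works; the function names are mine, and fetching the current copy of each 'linked from' page is left out. Given the current HTML of a page, it simply asks whether that page still links to the dead URL.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute target of every <a href> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any #fragment.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.add(absolute)


def still_links_to(page_html, page_url, dead_url):
    """Return True if the current copy of the page still links to dead_url."""
    collector = LinkCollector(page_url)
    collector.feed(page_html)
    target, _fragment = urldefrag(dead_url)
    return target in collector.links


def live_referrers(dead_url, current_pages):
    """Keep only the 'linked from' pages whose current HTML still has the link.

    current_pages maps each referring page URL to its freshly fetched HTML.
    """
    return [
        url
        for url, html in current_pages.items()
        if still_links_to(html, url, dead_url)
    ]
```

If `live_referrers` comes back empty for a 404, the only evidence for the error is an out-of-date copy, and under this rule it would not be reported.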
Is it that easy or am I missing something?