One of the more technical aspects of SEO is to understand, monitor and manage status codes.
What are Status Codes?
Status Codes are an essential part of HTTP, the request-response protocol that powers the Internet. Each time someone visits a page (including Googlebot) they ask the site for information. The status code is a numeric response to that request and provides guidance on how to proceed. You might be familiar with status codes such as 404 and 301.
SEO Status Codes
I recommend bookmarking the status code definitions documented by the W3C. However, I want to provide a quick reference guide specifically for SEO.
OK or Success. This is the response code you want to see most often. At a minimum, I want Googlebot to see a 200 response code in 90% or more instances during a crawl.
Moved permanently. This is the right way to redirect, telling search engines to index that content in a new location.
Moved temporarily. This is the wrong way to redirect (except in very rare cases). You’re essentially putting this content into limbo because it’s not at the current location but search engines won’t index the temporary location.
Not modified. This can be used for crawl efficiency, telling search engines that the content has not changed. You’re basically telling Googlebot not bother and to move on to other content. The advent of Caffeine may have made this unnecessary but I think it’s still worthwhile.
Not found. This happens when the client can’t find the content at that specific location. Too many 404s are bad. In my experience having too many is a negative algorithmic signal. Google simply doesn’t trust that sending a user to that site will be a positive experience.
I don’t have a hard and fast number for when 404s become problematic. I believe it’s probably based on a percentage of total requests to that site. As such, it’s just good practice to reduce the number of 404s.
That does not mean zero! I don’t recommend putting a 301 in place when it should return a 404. A request for domain.com/foo should return a 404. Ditto for returning a 200 when it should be a 404. (Yes, I’ve seen this lately.) I’d be surprised if having no 404s wasn’t also some sort of red flag.
Gone. If you know that content no longer exists, just say so. Don’t encourage Googlebot to come back again and again and again via a 404 which doesn’t tell it why that page no longer exists.
Internal Server Error. This generally means that the client never received an appropriate response from the site. 500 errors basically tell the search engine that the site isn’t available. Too many 500 errors call into question the reliability of that site. Google doesn’t want to send users to a site that ultimately times out and doesn’t load.
How to Track Status Codes
There are a number of ways you can track status codes. For spot checking purposes, I recommend installing one of two Firefox add-ons: HttpFox or Live HTTP Headers. These add-ons let you look at the communication between user agent and client. For example, what happens when I type ‘www.searchengineland.com’ directly into my browser bar.
Using HttpFox I see that it performs a 301 redirect to the non-www version and then resolves successfully. Google Webmaster Tools also provides you with nice insight through the Crawl Errors reporting interface.
But if you really want to use status codes to your benefit you’ll need to count and track them every day via log file analysis. I recommend creating a daily output that provides the count of status codes encountered by Googlebot and Bingbot.
Status Code Reports
Using those daily numbers you can construct insightful and actionable dashboard graphs.
While this may take some doing, the investment is worthwhile. You can quickly identify and resolve 404s and 500s. Many will find it helpful to have this data (concrete numbers!) so you can prioritize issues within a larger organization.
You’ll also gain insight into how long it takes search engines to ‘digest’ a 301 and much more. Status code management can be a valuable part of an advanced SEO program.