Cache in Hand

Google’s cache view is a valuable window into how googlebot “sees” a site. I find myself checking the cache every couple of days to untangle architectural challenges to my SEO objectives. When I’m having difficulty helping colleagues understand why content or links are or aren’t crawlable, I often take them to the cache view as a quick and easy visual. Once they see what the bots see, straight from googlebot itself, the conversation about how to resolve the issue is usually much easier. I wanted to include it in my article on advanced search operators at Practical eCommerce last week, but I hit the word count cap. So here’s the scoop on cache.

A site’s architecture and the technology choices its development team makes can make or break the bots’ ability to crawl a site. The cache view offers a quick window into the bots’-eye view. For example, most humans surfing with a modern browser that executes JavaScript and accepts cookies will see Dell’s homepage like this:

Dell.com Homepage

As a human I am able to use the drop-down menus to navigate to the main areas of the site, quickly consume many of Dell’s priority messages from the static feature boxes and the Flash carousel, and browse the basic HTML links toward the bottom of the page. Dell makes its marketing priorities very clear and easy to understand… for humans with modern browsers. But what about the bots? What content can they consume? Let’s take a look at the cache view [cache:www.dell.com]:

With the cache view the page looks remarkably similar. There’s a gray header at the top of the page indicating that Google last cached this page on Oct 4, 2010 18:22:07 GMT, one hour and one minute ago at the time of this article. So any changes that Dell made to the site in the last 61 minutes will not be reflected in this cache view. That’s a very important note when you’re trying to confirm the crawlability of some new architectural change — make sure the change has been cached before you start analyzing the cache view.

The second thing to consider is that the cache view shows a far more human-centric view of the page than I’d expect. That’s because the initial cache view still uses your modern browser to execute the JavaScript, apply the CSS and accept the cookies that the cached page calls. To see the bots’-eye view more realistically, we need to disable those by clicking the “Text-only version” link in the upper right corner of the gray box. Now we see:

Now we’re seeing the textual version of the site, stripped of its technical finery. The rollover navigation in the header no longer functions. The links to the main categories are still crawlable as plain text links, but the homepage doesn’t pass link popularity down to the subcategory pages. Depending on the marketing value of those pages, the lack of link juice flowing there could be an issue. The next thing we see is that the big, lovely Flash carousel, so front-and-center for human consumption, doesn’t exist without JavaScript enabled. Assuming the pages displayed in the Flash piece are valuable landing pages, which they likely are given the homepage coverage and development time they command, this again is a missed opportunity to flow link juice to important pages. Both of these issues, the navigation and the Flash carousel, could be coded to degrade gracefully, using CSS to provide the same crawlable text and links to bots as well as humans.
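If you’d rather not wait on the cache at all, you can approximate the text-only view with a quick script. Here’s a rough Python sketch (not part of the original article; the URL and user agent are placeholders to swap for whatever page you’re auditing) that fetches the raw HTML with no JavaScript execution and no cookies, then lists the plain anchor links a bot could actually follow:

    # Approximate the "text-only" view: fetch the raw HTML (no JavaScript
    # execution, no cookies) and list the plain <a href> links a crawler
    # could follow without any client-side help.
    from html.parser import HTMLParser
    from urllib.request import Request, urlopen

    class LinkCollector(HTMLParser):
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                href = dict(attrs).get("href")
                if href:
                    self.links.append(href)

    req = Request("http://www.dell.com/", headers={"User-Agent": "Mozilla/5.0"})
    raw_html = urlopen(req).read().decode("utf-8", errors="replace")

    collector = LinkCollector()
    collector.feed(raw_html)

    for link in collector.links:
        print(link)

If the links you care about, like the main categories or the carousel’s landing pages, don’t show up in that output, the bots aren’t seeing them either.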

Just to be safe, I double-check any issue I see in the cache view (or any issue I expect to see but don’t) by manually disabling my JavaScript, CSS and cookies, and I also set my user agent to googlebot. For more detailed information on the Firefox plugins I use to do this, see Surfing Like a Search Engine Spider on Practical eCommerce. The cache view is a quick way to decide whether a deeper analysis is required.
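That double check can also be scripted. Here’s a rough sketch, again an assumption on my part rather than part of the original article: it requests the same page with a browser-style user agent and with a Googlebot user agent (no cookies are sent in either case) and compares what comes back, which is a cheap way to spot user-agent-based differences in the markup before digging in with the browser plugins:

    # Rough "surf like a search engine spider" check: request the same page
    # as a browser and as Googlebot (user-agent strings are illustrative)
    # and compare what comes back. No cookies are sent in either case.
    from urllib.request import Request, urlopen

    URL = "http://www.dell.com/"  # placeholder; use the page you're auditing
    AGENTS = {
        "browser": "Mozilla/5.0",
        "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                     "+http://www.google.com/bot.html)",
    }

    for name, user_agent in AGENTS.items():
        req = Request(URL, headers={"User-Agent": user_agent})
        body = urlopen(req).read()
        print(f"{name}: {len(body)} bytes of HTML")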

Note: The cache: operator only works on Google, but Yahoo Site Explorer offers a cache link on each page in its report as well. Bing does not support the cache: operator.



Originally posted on Web PieRat.

Redesign with SEO in Mind

A site redesign or switch to a new platform is kind of like a rebirth – it’s one of the most exciting and nerve-wracking times for the entire Internet marketing team. With everyone caught up in the branding, design, usability and technology, the impact on SEO can sometimes be forgotten until the last minute.

I wrote this article on redesigning a site with SEO in mind back in July for MultichannelMerchant.com and gave up waiting for it to be published… so I missed its September publish date. Maybe you did too. Here’s a redux of the original article.

While it’s difficult to determine what the natural search impact will be until working code hits a development server, keeping several mantras in mind and repeating them liberally will keep the team focused on the most critical elements to plan for SEO success. I love these mantras — I actually say them to myself as I’m auditing sites.

SEO Development Mantras

  1. Links must be crawlable with JavaScript, CSS and cookies disabled.
  2. Plain text must be indexable on the page with JavaScript & CSS disabled.
  3. Every page must send a unique keyword signal.
  4. One URL for one page of content.
  5. We’re going to 301 that, right?

When a site is stable in the development environment and the URLs are ironed out, identify a 301 redirect plan, build the new XML sitemap and put a measurement plan in place to gauge the impact of the relaunch.
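To make the 301 part of that plan concrete, here’s a minimal sketch of the kind of check I’d run once the redirects are live on the development server. It assumes the third-party requests library, and the URL mapping is made up for illustration; the idea is simply to confirm that each old URL answers with a 301 pointing at the right new URL:

    # Minimal sanity check for a 301 redirect plan: request each old URL
    # without following redirects and confirm it answers 301 with the
    # expected Location header. The mapping below is illustrative only.
    import requests

    REDIRECT_PLAN = {
        "http://www.example.com/old-category.html": "http://www.example.com/new-category/",
        "http://www.example.com/old-product.html": "http://www.example.com/new-product/",
    }

    for old_url, expected in REDIRECT_PLAN.items():
        resp = requests.head(old_url, allow_redirects=False, timeout=10)
        location = resp.headers.get("Location", "")
        ok = resp.status_code == 301 and location == expected
        print(f"{'OK   ' if ok else 'CHECK'} {old_url} -> {resp.status_code} {location}")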

All this is covered in more detail in the original article at Multichannel Merchant: https://multichannelmerchant.com/ecommerce/0901-using-seo-redesign/index.html



Originally posted on Web PieRat.