Tuesday, October 25, 2005

Protecting Documents from Google Accelerator

In light of the heated controversy surrounding Google Web Accelerator, I'm toying around with a way to produce "links" that would be immune to such technologies.

The main issue is that most user-agent implementations tie a user interface paradigm, an "anchor", to an HTTP method, "GET".

Through scripting, there are quite a few ways to make an anchor tag more immune to accelerators and automated crawlers: a javascript: URL as the href attribute value, "#" as the href attribute value, an onclick attribute that submits a form, etc.

In an attempt to explore alternatives to scripting, I've started toying around with the "button" HTML element. So far, I've found that Mac MS IE 5 doesn't appear to support it. Everything else is looking reasonably happy.

Here's what I'm looking at so far.

It seems to work in: Opera, Gecko, Safari, Treo650/Blazer, Windows IE
It does not seem to work in: Mac MSIE5, SideKick (thanks Kevin).

- Can anyone try more handheld devices?
- One might add a wee bit of scripting to set window.status.
- Removing various CSS directives from that example gets you closer to the original "button" construct, as rendered by default by the user agent. Good to play with.
- You no longer benefit from a browser's "default way of rendering a link".
- I need to test this with images. - done: it works :)
- Notice what the browser does when your mouse is "down": it lowers the text. Not sure how to override the initial "position" with CSS.
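
For readers who want something concrete to play with, the form/button idea described above might be sketched roughly as follows. This is not the original markup or stylesheet from this post: the helper name link_button, the CSS class "link", the CSS rules, and the default name/value pair are all assumptions made for illustration, written here as a small Python helper that emits the HTML.

```python
from html import escape

# Hypothetical CSS (not the post's original stylesheet): strips the default
# button chrome so the control reads like a text link.
LINK_BUTTON_CSS = """
button.link {
  background: none;
  border: 0;
  padding: 0;
  color: #00c;
  text-decoration: underline;
  cursor: pointer;
}
"""


def link_button(action: str, label: str, name: str = "action", value: str = "go") -> str:
    """Return a POST form whose only visible control is a link-styled button."""
    return (
        f'<form method="post" action="{escape(action, quote=True)}" style="display:inline">'
        f'<button type="submit" class="link" name="{escape(name, quote=True)}" '
        f'value="{escape(value, quote=True)}">{escape(label)}</button>'
        "</form>"
    )
```

Calling link_button("/cart/remove", "remove this item") yields a form that only ever issues a POST, so a tool that follows anchors with GET has nothing to follow. Dropping rules from the assumed CSS moves the control back toward the user agent's default button rendering.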

See also:

JAH, by Kevin Marks.


Andrew Green said...

This is really nice work! It's hard to use POST exclusively for "non-idempotent" actions when it's impossible to use a regular link. Sadly, <a href="..." method="POST"> doesn't exist. Your work here seems to be a very useful way of achieving basically the same result.

Dimitri Glazkov said...

IMHO, this is an interesting exploration of making a button look like a link.

There could be two reasons you wouldn't want Google to follow a link:

1) The hyperlink points to a resource somewhere you don't want the user to go (like a potential spam link), in which case you should use rel="nofollow"

2) You are doing something evil (non-idempotent) with the hyperlink and it shouldn't be a hyperlink in the first place. That's where your handy method comes in.

However, this method should not be abused carte blanche to just prevent Google from crawling.

Chris Holland said...

Andrew: cool, thanks for stopping by. Yeah, I'm shooting for a near 1-to-1 replacement in its most basic state. When used in more advanced ways, the form/button combo could allow us to do more advanced "things". Each "button" of type submit can have a different name/value pair to signify a different action sent to the same POST URI. That's something we can already do with forms, but this time around, things can look like links.
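
As a rough sketch of that name/value idea, the receiving end could read which button was pressed from the POST body. The field name "action", the values "remove" and "logout", and the function name dispatch are hypothetical, and the sketch assumes the browser submits only the pair belonging to the button that was actually activated.

```python
from urllib.parse import parse_qs

# Hypothetical handlers, keyed by the value of the submitted button.
HANDLERS = {
    "remove": lambda fields: "removed item " + fields.get("item", ["?"])[0],
    "logout": lambda fields: "logged out",
}


def dispatch(body: str) -> str:
    """Pick a handler from the button's name/value pair in a urlencoded body."""
    fields = parse_qs(body)
    action = fields.get("action", [""])[0]
    handler = HANDLERS.get(action)
    return handler(fields) if handler else "unknown action"
```

Two buttons in the same form, one with value "remove" and one with value "logout", would then reach different handlers at a single POST URI.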

Chris Holland said...


The main issue this tries to address is the Google Web Accelerator, and any other similar technology that might come down the road, though your points do remain valid.

With GWA, say you're logged into some admin interface, or a shopping cart, with anchor hrefs all over the place to "remove this item" or "logout", etc.

GWA will "prefetch" those *for you*, which is the root of the uproar I linked to in that post, and has the Rails Dudes up in arms. Most stuff I've worked on doesn't typically do destructive/transactional stuff in a straight HTTP GET/link, but it's an issue for many developers.
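
A toy simulation of the prefetch problem described above, under loud assumptions: the cart, the /cart/remove/ paths, the "safe" flag, and the prefetch function are invented for illustration and do not model how GWA actually works internally.

```python
def handle(cart: set, method: str, path: str, safe: bool) -> None:
    """Remove an item from the cart; when safe is True, only POST may do so."""
    if not path.startswith("/cart/remove/"):
        return
    if safe and method != "POST":
        return  # GET stays free of side effects
    cart.discard(path.rsplit("/", 1)[-1])


def prefetch(cart: set, links: list, safe: bool) -> None:
    """Stand-in for an accelerator: issue a GET for every link on the page."""
    for href in links:
        handle(cart, "GET", href, safe)


links = ["/cart/remove/book", "/cart/remove/lamp"]

unsafe_cart = {"book", "lamp"}
prefetch(unsafe_cart, links, safe=False)  # destructive GETs empty the cart

safe_cart = {"book", "lamp"}
prefetch(safe_cart, links, safe=True)     # POST-only removal leaves it intact
```

With removal wired to GET, prefetching every link empties the cart without a single click; with removal restricted to POST, the same prefetch pass changes nothing.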

While a few JS alternatives are out there, I'm just offering yet another alternative, script-free, that strives to use pure XHTML constructs :)
