Tuesday, March 01, 2005

ContextAgnosticXmlHttpRequest: an Informal RFC

update 08/22/2007: Per an anonymous commenter, Mozilla is about to release cross-domain support for XmlHttpRequest.
update 3/17/2006: See also the JSONRequest proposal by Douglas Crockford.
update 10/3/2005: Microsoft is asking similar questions. Please read the full discussions we had on ContextAgnosticXmlHttpRequest on the WHATWG mailing list. The thread goes on for quite a while, with good thoughts from many developers. Some key points I've extracted:

- There was a pretty decent consensus on requiring an HTTP service to send one extra HTTP header that we might call X-Allow-Foreign-Hosts. A ContextAgnosticXmlHttpRequest would fail to expose any data from a service that doesn't send this header. This ought to protect intranets. (A sketch of what this opt-in might look like on the wire follows this list.)

- A couple of us firmly believe such a request should never send any cached/saved cookies or Basic Auth credentials.
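To illustrate that opt-in, here's roughly what the exchange might look like on the wire. The header name comes from the discussion above; the hosts are the use-case examples from further down, and the header's value is purely made up, since none was ever settled on:

    GET /data.xml HTTP/1.1
    Host: somehost.mySECONDdomain.com
    Referer: http://somehost.myFIRSTdomain.com/page.html

    HTTP/1.1 200 OK
    Content-Type: text/xml
    X-Allow-Foreign-Hosts: *

Without that last response header, the browser would expose none of the response to the calling document.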



XmlHttpRequest is 2004-2005's big buzz. This post is intended for people who are familiar with this technology. If you wish to familiarize yourself with it, you might consider Apple's fine introduction to XmlHttpRequest.

The current security model of XmlHttpRequest prevents a document from initiating an HTTP request to a host different from the one that served it. There are very, very good, useful, critically important reasons for this: your browser sends cookies along with the request, so without this restriction any page you visit could silently pull your private, logged-in data from another site and read it. For similar reasons, a Flash application may only talk back to the same host that served it. This model already allows developers to build a host of insanely great things, and I'm very happy with it.
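For a quick illustration of today's restriction, here's the kind of request the current model refuses outright. The hostnames are borrowed from the use case below, and the exact failure (a "permission denied" style security error) varies by browser:

    // A document served by somehost.myFIRSTdomain.com tries to fetch XML from
    // a different host. Today's browsers refuse the cross-host request.
    var req = new XMLHttpRequest(); // new ActiveXObject("Microsoft.XMLHTTP") on IE
    req.open("GET", "http://somehost.mySECONDdomain.com/data.xml", true);
    req.send(null);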

This informal RFC is attempting to explore the Pros and Cons of an additional type of request: a ContextAgnosticXmlHttpRequest Object. It would not replace the current implementation of XmlHttpRequest. It would be another object we could leverage for other types of requirements:

Here's a basic use case:

A document served by somehost.myFIRSTdomain.com would retrieve XML data over HTTP from somehost.mySECONDdomain.com.

And, by extension:

A document served by host1.mydomain.com would retrieve XML data over HTTP from host2.mydomain.com.

This request would (a usage sketch follows this list):
  1. Allow a document to perform an HTTP request to a foreign host.
  2. Send no cookies that would otherwise be in effect for that host: no Cookie: header in the request.
  3. Discard Set-Cookie: directives in HTTP responses from the target host.
  4. Ignore any cached HTTP Basic Auth credentials, and only send credentials that were explicitly set (a URI of the form http://username:password@host, or setter methods?).
  5. Always send an accurate HTTP Referer: request header whose value is the URI of the document executing the request. See more below.
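To make that behavior concrete, here's a rough usage sketch. Nothing below exists in any browser: the ContextAgnosticXmlHttpRequest constructor is this proposal's name, the hostnames are the use-case examples above, and handleFeed is just a hypothetical callback in the calling page.

    var req = new ContextAgnosticXmlHttpRequest();
    req.open("GET", "http://somehost.mySECONDdomain.com/feed.xml", true);
    req.onreadystatechange = function () {
      if (req.readyState != 4) return;
      // The browser would send no Cookie: header, discard any Set-Cookie:
      // directives from the response, and always send a truthful Referer:
      // header. If the response lacked the proposed X-Allow-Foreign-Hosts
      // header, no data would be exposed here at all.
      if (req.status == 200 && req.responseXML) {
        handleFeed(req.responseXML); // hypothetical callback
      }
    };
    req.send(null);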
Reasons to not do this / Abuse / Security / Discussion Points:
  1. update 11/09/2005: see the above X-Allow-Foreign-Hosts header suggestion, which ought to nicely alleviate this problem. In other words, your XML service can only be used by foreign documents if you explicitly allow it by sending this extra HTTP header. The issue: everyone and their Moms could now paste some silly HTML code into their web pages to retrieve XML data from web-services-providing web sites. I could paste the Google Suggest code into my own web page and show results from Google's servers on my own web page, without going through the trouble of proxying them first. In turn, Google would have to start locking their application down. "Hijacking" HTTP XML services isn't new, though: plenty of people have leveraged GMail's API in standalone applications without exactly asking Google for permission. But the learning-and-adoption curve for abuse is far steeper today: you've got to know a little more about software development than pasting a piece of HTML code into your blog template. "If you're going to leverage someone else's HTTP/XML API in your web documents, be a sport, share in the bandwidth cost and at least proxy them." I'd most definitely file this under "not unreasonable!".
  2. This would make "the game" interesting for web sites that are RSS aggregators. They would now have the ability to let their end-users load their favorite sites' RSS XML data directly into the web browser without, technically, "having to" offer a server-side aggregating/caching/proxying layer. If poorly coded, such an aggregator site could drive up the bandwidth costs of the most-subscribed-to blogs.
  3. There is one HTTP header I would really like a ContextAgnosticXmlHttpRequest to always send out with absolute integrity: the Referer: (sic) HTTP header. For those who are not already aware of this, the misspelling of the word "Referrer" as "Referer" is part of the official HTTP specification. This header would basically identify the URL of the document originating the request. The entity managing the service receiving the request should be able to easily build white lists and black lists to effectively restrict browser-based access to their service (a small sketch of such a check follows this list): "If you're not coming from www.google.com, I ain't serving you sh!t."
  4. Beyond the Cookie: HTTP Header, are there any other HTTP headers that should not be included?
  5. Could phishers abuse this to further obfuscate what goes on in their web documents? They've already got plenty of tools in their shed. I've often seen phishers set up multiple domains as "landing pages" to pose as banking sites, while thoroughly obfuscating their HTML code to hide which host/CGI they're actually submitting the data to: disparate landing pages might submit to the same host/CGI. Even with obfuscated HTML, it's relatively easy to sniff out the form submission target by crawling the DOM via a browser plugin or javascript:document.forms, as sketched after this list. I'd hate to add yet another Remote-Scripting hack to their tool-shed.
  6. Should restrictions be imposed on HTTP methods? Say, only HTTP GET is allowed, at which point we might rename the object to reflect this restriction. The idea is that HTTP POST is more likely to be used for transactional purposes, which is unlikely to happen at a cross-host/domain level, and I'm leaning toward not allowing the HTTP POSTing of large amounts of data to a completely different domain. See also phishers above.
  7. What about HTTPS? Currently, an HTTP document can't initiate an HTTPS XmlHttpRequest. Should we retain the same restrictions for this object?
  8. Should HTTP Basic Auth even be supported?
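On the Referer: point above, the receiving service's white-list check might boil down to something like the following. It's sketched in JavaScript purely to illustrate the logic, and the list entries are made-up examples rather than anyone's actual policy:

    // Only serve the XML payload when the Referer: header starts with one of
    // the approved origins; otherwise return an error instead of the data.
    var allowedReferrers = [
      "http://www.google.com/",
      "http://somehost.myFIRSTdomain.com/"
    ];
    function refererIsAllowed(refererHeader) {
      if (!refererHeader) return false; // no Referer: header, no service
      for (var i = 0; i < allowedReferrers.length; i++) {
        if (refererHeader.indexOf(allowedReferrers[i]) == 0) return true;
      }
      return false;
    }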
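And for the curious, the document.forms trick mentioned in point 5 is about this simple; it already works in today's browsers and needs nothing from this proposal:

    // List every form's real submission target in the current document,
    // however obfuscated the surrounding HTML may be.
    for (var i = 0; i < document.forms.length; i++) {
      alert(document.forms[i].action); // the host/CGI the data gets posted to
    }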

4 comments:

Chris Holland said...

Dimitri, thanks for stopping by :) I don't think we should put all forms of "cross-site-scripting" in the same bag, and I do think it's worth our while to look at specific use cases.

I do believe there is value in allowing sites that don't otherwise inherently "trust each other" to interoperate on some levels. FRAMEs and IFRAMEs allow us to do a lot of this today:

- you can stick an iframe that loads any web site on the Internet inside your web document. But unless both frames are served from the same host, no cross-frame scripting is allowed (a tiny sketch follows these two bullets).

- xmlhttprequest is a different beast, but I'm still trying, through either one of those two objects, to allow some form of interoperability under restricted conditions. Here we're not talking about cross-frame scripting, but about letting two different sites share data within one web document, the main caveat being that, within a ContextAgnosticXmlHttpRequest, no user data can ever be sent along with the request. This obviously precludes transactions between the two sites, but that's fine.
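To make the first point concrete, here's roughly what the browser blocks today; the frame lookup is an arbitrary example:

    // The outer document frames a page served from a different host.
    var frame = document.getElementsByTagName("iframe")[0];
    // This works when both documents came from the same host; otherwise the
    // browser throws a "permission denied" style security error right here.
    var foreignTitle = frame.contentWindow.document.title;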

What I'm basically getting at is that I don't want the foreign site to have to trust my site: the foreign site just happens to be publishing some public, non-sensitive XML data over HTTP, which readers of my web page might find useful and which I'd like to surface.

Anonymous said...

I am in favor of "NO"

Use XmlHttp to poll the server; let the server handle all external requests or web service calls.

No more headaches about security.
Keep the client clean.

Anonymous said...

I note that this (or something like it) is on the verge of being implemented and released: https://bugzilla.mozilla.org/show_bug.cgi?id=389508

Chris Holland said...

Ah, thanks!