by Danny Thorpe
Summary: A shopper can walk into
virtually any store and make a purchase with nothing more than a plastic card
and photo ID. The shopper and the shopkeeper need not share the same currency,
nationality, or language. What they do share is a global communications system
and global banking network that allows the shopper to bring their bank services
with them wherever they go and provides infrastructure support to the
shopkeeper. What if the Internet could provide similar protections and services
for Web surfers and site keepers to share information?
Contents
IFrame URL Technique
Hiding Data in Bookmarks
Sender Identification
Sending to the Sender
Stateful Receiver
Application of Ideas
User Empowerment
Acknowledgments
Resources
Developing applications that live inside the Web browser is a lot
like window shopping on Main Street:
lots of stores to choose from, lots of wonderful things to look at in the
windows of each store, but you can't get to any of it. Your cruel stepmother,
Frau Browser, yanks your leash every time you lean too close to the glass. She
says it's for your own good, but you're beginning to wonder if your short leash
is more for her convenience than your safety.
Web browsers isolate pages living in different domains to prevent
them from peeking at each other's notes about the end user. In the early days
of the Internet, this isolation model was fine because few sites placed
significant application logic in the browser client, and even those that did
were only accessing data from their own server. Each Web server was its own
silo, containing only HTML links to content outside itself.
That's not the Internet today. The Internet experience has
evolved into aggregating data from multiple domains. This aggregation is driven
by user customization of sites as well as sites that add value by bringing
together combinations of diverse data sources. In this world, the Web browser's
domain isolation model becomes an enormous obstacle hindering client-side Web
application development. To avoid this obstacle, Web app designers have been
moving more and more application logic to their Web servers, sacrificing server
scalability just to get things done. Meanwhile, the end user's 2GHz, 2GB dumb
terminal sits idle.
If personal computers were built like a Web browser, you could
save your data to disk, but you couldn't use those files with any other application
on your machine, or anyone else's machine. If you decided to switch to a
different brand of photo editor, you wouldn't be able to edit any of your old
photos. If you complained to the makers of your old photo editor, they would
sniff and declare "We don't know what that other photo editor might do
with your data. Since we don't know or trust that other photo editor, then
neither should you! And no, we won't let you use 'your' photos with them,
because since we're providing the storage space for those photos, they're
really partly our photos."
You couldn't even find your files unless you knew first which
application you created them with. "Which photo editor did I use for
Stevie's birthday photos? I can't find them!"
And what happens when that tragically hip avant-garde photo
editor goes belly up, never to be seen again? It takes all your photos with it!
Sound familiar? It happens to all of us every day using Internet
Web sites and Web applications. Domain isolation prevents you from using your
music playlists to shop for similar tunes at an independent online store
(unrelated to your music player manufacturer) or at a kiosk within a retail
store.
Domain isolation also makes it very difficult to build
lightweight low-infrastructure Web applications that slice and dice data drawn
from diverse data servers within a corporate network. A foo.bar.com subdomain
on your internal bar.com corpnet is just as isolated from bar.com and
bee.bar.com as it is from external addresses like xyz.com.
Nevertheless, you don't want to just tear down all the walls and
pass around posies. The threats to data and personal security that the browser's
strict domain isolation policy protects against are real, and nasty. With
careful consideration and infrastructure, there can be a happy medium that
provides greater benefit to the user while still maintaining the necessary
security practices. Users should be in control of when, what, and how much of
their information is available to a given Web site. The objective here is not
free flow of information in all directions, but freedom for users to use their
data where and when it serves their purposes, regardless of where their data
resides.
What is needed is a way for the browser to support legitimate
cross-domain data access without compromising end user safety and control of
their data.
One major step in that direction is the developing standards
proposal organized by Ian Hickson to extend XMLHttpRequest to support
cross-domain connections, using domain-based opt-in/opt-out by the server being
requested. (See Resources.) If this
survives peer review and if it is implemented by the major browsers, it offers
hope of diminishing the cross-domain barrier for legitimate uses, while still
protecting against illegitimate uses. Realistically, though, it will be years
before this proposal is implemented by the major browsers and ubiquitous in the
field.
What can be done now? There are patterns of behavior supported by
all the browsers which allow JavaScript code living in one browser domain
context to observe changes made by JavaScript living in another domain context
within the same browser instance. For example, changes made to the width or
height property of an iframe are observable inside as well as outside the
iframe. Another example is the iframe.src property. Code outside a cross-domain
iframe cannot read the URL of the document the iframe is currently displaying,
but it can assign a new URL to the iframe's src property. Thus, code outside
the iframe can send data into the iframe via the iframe's URL.
This URL technique has been used by Web designers since iframes
were first introduced into HTML, but uses are typically primitive,
purpose-built, and hastily thrown together. What's worse, passing data through
the iframe src URL can create an exploit vector, allowing malicious code to
corrupt your Web application state by throwing garbage at your iframe. Any code
in any context in the browser can write to the iframe's .src property, and the
receiving iframe has no idea where the URL data came from. In most situations,
data of unknown origin should never be trusted.
This article will explore the issues and solution techniques of
the secure client-side cross-domain data channel developed by the Windows Live
Developer Platform group.
IFrame URL Technique
An iframe is an HTML element that encapsulates and displays an
entire HTML document inside itself, allowing you to display one HTML document
inside another. We'll call the iframe's parent the outer page or host page, and
the iframe's content the inner page. The inner page is specified by
assigning a URL to the iframe's src property.
When the iframe's source URL has the same domain name as the
outer, host page, JavaScript in the host page can navigate through the iframe's
interior DOM and see all of its contents. Conversely, the iframe can navigate
up through its parent chain and see all of its DOM siblings in the host page
and their properties. However, when the iframe's source URL has a domain
different from the host page, the host cannot see the iframe's contents, and
the iframe cannot see the host page's contents.
Even though the host cannot read the current location of a cross-domain
iframe's document, it can still write to the iframe element's src property. The
host page doesn't know what the iframe is currently displaying, but it can force
the iframe to display something else.
Each time a new URL is assigned to the iframe's src property, the
iframe will go through all the normal steps of loading a page, including firing
the onLoad event.
We now have all the pieces required to pass data from the host to
the iframe on the URL. (See Figure 1.) The host page in domain foo.com can
place a URL-encoded data packet on the end of an existing document URL in the
bar.com domain. The data can be carried in the URL as a query parameter using
the ? character (http://bar.com/receiver.html?datadatadata) or as a bookmark
(also known as a fragment identifier) using the # character
(http://bar.com/receiver.html#datadatadata). There's a
big difference between these two URL types, which we'll explore in a moment.
Figure 1. iframe URL data passing
The host page assigns this URL to the iframe's src property. The
iframe loads the page and fires the page's onLoad event handler. The iframe
page's onLoad event handler can look at its own URL, find the embedded data
packet, and decode it to decide what to do next.
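The steps above can be sketched roughly as follows. This is a minimal illustration, not the channel library's code: the helper names buildMessageUrl and readPayload are hypothetical, and the sketch assumes the payload is a plain string carried after the # character.

```javascript
// Hypothetical helpers for illustration; the names are assumptions, not a real API.

// Host side: append a URL-encoded payload to a known document URL.
function buildMessageUrl(documentUrl, payload) {
  return documentUrl + "#" + encodeURIComponent(payload);
}

// Iframe side: recover the payload from the page's own URL.
// Returns null when no payload is present or when it cannot be decoded.
function readPayload(url) {
  const hashIndex = url.indexOf("#");
  if (hashIndex === -1) return null;
  try {
    return decodeURIComponent(url.slice(hashIndex + 1));
  } catch (e) {
    return null; // malformed escape sequence: treat as garbage
  }
}

// In a browser the host page would assign the URL to the iframe:
//   iframeElement.src = buildMessageUrl("http://bar.com/receiver.html", data);
// and the inner page's onload handler would call:
//   readPayload(window.location.href);
const messageUrl = buildMessageUrl("http://bar.com/receiver.html", "name=Stevie&age=7");
console.log(messageUrl);
console.log(readPayload(messageUrl));
```

Encoding the payload keeps characters such as & and = from being confused with URL syntax, and the receiver treats anything it cannot decode as absent rather than trusting it.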
That's the iframe URL data passing technique at its simplest. The
host builds a URL string from a known document URL plus a data payload and
assigns it to the src property of the iframe; the iframe "wakes up" in the
onLoad event handler and receives the data payload. What more could you ask
for?
A lot more, actually. There are many caveats with this simple
technique:
· No acknowledgement of receipt: The host page has no idea if the iframe
successfully received the data.
· Message overwrites: The host doesn't know when the iframe has finished
processing the previous message, so it doesn't know when it's safe to send the
next message.
· Capacity limits: A URL can be only so long, and the length limit
varies by browser family. Firefox supports URLs as long as 40k or so, but IE
sets the limit at less than 4k. Anything longer than that will be truncated or
ignored.
· Data has unknown origin: The iframe has no idea who put the data into its
URL. The data might be from our friendly foo.com host page, or it might be
evil.com lobbing spitballs at bar.com hoping something will stick or blow up.
· No replies: There's no way for script in the iframe to pass
data back to the host page.
· Loss of context: Because the page is reloaded with every message,
the iframe inner page cannot maintain global state across messages.
Hiding Data in Bookmarks
Should we use ? or # to tack data onto the end of the iframe URL?
Though innocuous enough on the surface, there are actually a few significant
differences in how the browsers handle URLs with query params versus URLs with
bookmarks. Two URLs with the same base path but different query params are
treated as different URLs. They will appear separately in the browser history
list, will be separate entries in the browser page cache, and will generate
separate network requests across the wire.
URL bookmarks were designed to refer to specially marked anchor
tags within a page. The browser considers two URLs with the same base path but
with different bookmark text after the # char to be the same URL as far as
browser history and caches are concerned. The different bookmarks are just
pointing to different parts of the same page (URL), but it's the same page
nonetheless.
The URLs http://bar.com/page.html#one,
http://bar.com/page.html#two, and http://bar.com/page.html#three are considered
by the browser to be cache-equivalent to http://bar.com/page.html. If we used
query params, the browser would see three different URLs and three different
trips across the network wire. Using bookmarks, however, we have at most one
trip across the network wire; subsequent requests will be filled from the local
browser cache. (See Figure 2.)
Figure 2. Cache equivalence of bookmark URLs
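The cache-equivalence idea can be illustrated with a small sketch. The cacheKey function below is a hypothetical stand-in for "the URL as the browser cache sees it"; it simply drops the bookmark portion using the standard URL class and is not a model of any browser's actual cache.

```javascript
// Illustration only: approximate the cache identity of a URL by removing
// the bookmark (the part after #). Query params are left in place.
function cacheKey(url) {
  const parsed = new URL(url);
  parsed.hash = "";
  return parsed.href;
}

// Count how many distinct cache entries a list of URLs would produce.
function distinctCacheEntries(urls) {
  return new Set(urls.map(cacheKey)).size;
}

const bookmarkUrls = [
  "http://bar.com/page.html#one",
  "http://bar.com/page.html#two",
  "http://bar.com/page.html#three",
];
const queryUrls = [
  "http://bar.com/page.html?one",
  "http://bar.com/page.html?two",
  "http://bar.com/page.html?three",
];

console.log(distinctCacheEntries(bookmarkUrls)); // 1: all map to page.html
console.log(distinctCacheEntries(queryUrls));    // 3: each is a different URL
```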
For cases where we need to send a lot of messages across the
iframe URL using the same base URL, bookmarks are perfect. The data payloads in
the bookmark portion of the URL will not appear in the browser history or
browser page cache. What's more, the data payloads will never cross the network
wire after the initial page load is cached!
The data passed between the host page and the iframe cannot be
viewed by any other DOM elements on the host page because the iframe is in a
different domain context from the host page. The data doesn't appear in the
browser cache, and the data doesn't cross the network wire, so it's fair to say
that the data packets are observable only by the receiving iframe or other
pages served from the bar.com domain.
Sender Identification
Perhaps the biggest security problem with the simple iframe URL
data-passing technique is not knowing with confidence where the data came from.
Embedding the name of the sender or some form of application ID is no solution,
as those can be easily copied by impersonators. What is needed is a way for a
message to implicitly identify the sender in a way that cannot be
easily copied.
The first solution that pops to mind for most people is to use
some form of encryption using keys that only the sender and receiver possess.
This would certainly do the job, but it's a rather heavy-handed solution,
particularly when JavaScript is involved.
There is another way, which takes advantage of the critical
importance of domain name identity in the browser environment. If I can send a
secret message to you using your domain name, and I later receive that secret
as part of a data packet, I can reasonably deduce that the data packet came
from your domain.
The only way for the secret to come from a third-party domain is
if your domain has been compromised, the user's browser has been compromised,
or my DNS has been compromised. All bets are off if your domain or your browser
has been compromised. If DNS poisoning is a real concern, you can use https to
validate that the server answering requests for a given domain name is in fact
the legitimate server.
If the sender gives a secret to the receiver, and the receiver
gives a secret to the sender, and both secrets are carried in every data packet
sent across the iframe URL data channel, then both parties can have confidence
in the origin of every message. Spitballs thrown in by evil.com can be easily
recognized and discarded. This exchange of secrets is inspired by the SSL/https
three-phase handshake.
These secrets do not need to be complex or encrypted, since the
data packets sent through the iframe URL data channel are not visible to any
third party. Random numbers are sufficient as secrets, with one caveat: The
JavaScript random-number generator (Math.random()) is not cryptographically
strong, so it risks producing predictable number sequences. Firefox
provides a cryptographically strong random-number generator (crypto.random()),
but IE does not. As a result, in our implementation we opted to generate strong
random numbers on the Web server and send them down to the client as needed.
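To make the exchange of secrets concrete, here is a minimal sketch of how each side might stamp and check packets. The JSON packet layout and the names wrapMessage and unwrapMessage are assumptions made for illustration; the article does not describe the actual wire format. The secrets are taken as given (for example, strong random numbers supplied by each side's Web server).

```javascript
// Hypothetical packet layout: every message carries both secrets.
//   from: the secret the sender was given by the receiver's peer exchange
//   to:   the secret the receiver handed out
function wrapMessage(mySecret, peerSecret, data) {
  return JSON.stringify({ from: mySecret, to: peerSecret, data: data });
}

// Accept a packet only if it carries the peer's secret and our own.
// Anything else (garbage, wrong secrets) is rejected.
function unwrapMessage(packet, mySecret, peerSecret) {
  let parsed;
  try {
    parsed = JSON.parse(packet);
  } catch (e) {
    return { accepted: false }; // not even well-formed
  }
  if (parsed === null || typeof parsed !== "object") return { accepted: false };
  if (parsed.from !== peerSecret || parsed.to !== mySecret) return { accepted: false };
  return { accepted: true, data: parsed.data };
}

// Sample values only; real secrets should be strong random numbers.
const fooSecret = "7391640285";
const barSecret = "1804477263";

const genuine = wrapMessage(fooSecret, barSecret, "hello from foo.com");
const spitball = wrapMessage("guess", "guess", "hello from evil.com");

console.log(unwrapMessage(genuine, barSecret, fooSecret).accepted);
console.log(unwrapMessage(spitball, barSecret, fooSecret).accepted);
```

The check is a plain comparison rather than encryption, which matches the reasoning above: the secrets only prove where a packet came from, and they stay private because the channel itself is not visible to third parties.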
Sending to the Sender
Most of the problems associated with the iframe URL data passing
technique boil down to reply generation. Acknowledging packets requires the
receiver to send a reply to the sender. Exchanging secrets requires replies in
both directions. Message throttling and breaking large data payloads into
multiple smaller messages require receipt acknowledgement.
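As a rough illustration of why throttling depends on acknowledgement, the sketch below queues outgoing messages and releases the next one only after the previous one has been acknowledged. The class name AckGatedSender and the transmit callback are hypothetical; in a browser, transmit would be whatever writes the message into the iframe's URL.

```javascript
// Illustration only: one message in flight at a time.
class AckGatedSender {
  constructor(transmit) {
    this.transmit = transmit; // assumed: function that actually sends a message
    this.queue = [];
    this.awaitingAck = false;
  }

  // Queue a message; it goes out immediately only if nothing is in flight.
  send(message) {
    this.queue.push(message);
    this.pump();
  }

  // Call this when the receiver's acknowledgement arrives.
  acknowledge() {
    this.awaitingAck = false;
    this.pump();
  }

  pump() {
    if (this.awaitingAck || this.queue.length === 0) return;
    this.awaitingAck = true;
    this.transmit(this.queue.shift());
  }
}

// Usage: messages are released one per acknowledgement.
const transmitted = [];
const sender = new AckGatedSender((m) => transmitted.push(m));
sender.send("first");
sender.send("second");
console.log(transmitted.length); // only "first" has gone out so far
sender.acknowledge();
console.log(transmitted.length); // "second" follows the acknowledgement
```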
Figure 3. Message in a Klein Bottle
So, how can the iframe communicate back up to the host page? Not
by going up, but by going down. The iframe can't assign to anything in its
parent because the iframe and the parent reside in different domain contexts.
But the bar.com iframe (A) can contain another iframe (B), and A can assign to B's
src property a URL in the domain of the host page (foo.com). The result is a
nesting: the foo.com host page contains bar.com iframe (A), which contains
foo.com iframe (B).
Great, but what can that inner iframe do? It can't do much with
its parent, the bar.com iframe. But go one more level up and you hit pay dirt:
B's parent's parent is the host page in foo.com. B's page is in foo.com,
B.parent.parent is in foo.com, so B can access everything in the host page and
call JavaScript functions in the host page's context.
The host page can pass data to iframe A by writing a URL to A's
src property. A can process the data, and send an acknowledgement to the host
by writing a URL to B's src property. B wakes up in its onLoad event and passes
the message up to its parent's parent, the host page. Voilà.
Round-trip acknowledgement from a series of one-way pipes connected together in
a manner that would probably amuse Felix Klein, mathematician and bottle
washer.
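The reply path just described might look something like the sketch below. It is an illustration under stated assumptions: onChannelReply is a hypothetical function that the host page defines, relay.html is a hypothetical foo.com page loaded into B, and the window and iframe objects are passed in as parameters so the logic can be exercised outside a browser.

```javascript
// Runs in iframe A (bar.com): send a reply by pointing B at a foo.com URL.
// frameB is the iframe element for B inside A's document.
function sendReply(frameB, relayPageUrl, reply) {
  frameB.src = relayPageUrl + "#" + encodeURIComponent(reply);
}

// Runs in iframe B (foo.com), typically from its onload handler:
//   window.onload = function () { relayReplyToHost(window); };
// win is B's own window object.
function relayReplyToHost(win) {
  const href = win.location.href;
  const hashIndex = href.indexOf("#");
  const reply = hashIndex === -1 ? "" : decodeURIComponent(href.slice(hashIndex + 1));
  // B's parent is A (bar.com, off limits); B's parent's parent is the host
  // page, which shares B's domain, so its functions can be called directly.
  const host = win.parent.parent;
  host.onChannelReply(reply); // hypothetical callback defined by the host page
  return reply;
}
```

Because B and the host page are in the same domain, the call on the last step is an ordinary same-domain function call; only the hop from A to B has to travel through a URL.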
Stateful Receiver
To maintain global state in the bar.com context across multiple
messages sent to the iframe, use two iframes with bar.com pages. Use one of the
iframes as a stateless message receiver, reloading and losing its state with
every message received. Place the stateful application logic for the bar.com
side of the house in the other iframe. Reduce the messenger iframe page logic
to the bare minimum required to pass the received data to the stateful bar.com
iframe.
An iframe cannot enumerate the children of its parent to find
other bar.com siblings, but it can look up a sibling iframe using
window.parent.frames[] if it knows the name of the sibling iframe. Each time it
reloads to receive new data on the URL, the messenger iframe can look up its
stateful bar.com sibling iframe using window.parent.frames[] and call a
function on the stateful iframe to pass the new message data into the stateful
iframe. Thus, the bar.com domain context in browser memory can accumulate
message chunks across multiple messages to reconstruct a data payload larger
than the browser's maximum URL length.
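A minimal sketch of this arrangement follows. The chunk numbering scheme and the names splitIntoChunks, ChunkAssembler, and forwardChunk are assumptions for illustration, not the channel library's design. The window object is passed in as a parameter so the messenger logic can be run outside a browser.

```javascript
// Sender side: split a payload into pieces small enough to fit in one URL each.
// chunkSize should leave headroom, since URL encoding can lengthen the text.
function splitIntoChunks(payload, chunkSize) {
  const chunks = [];
  for (let i = 0; i < payload.length; i += chunkSize) {
    chunks.push(payload.slice(i, i + chunkSize));
  }
  return chunks;
}

// Stateful iframe side: survives across messages and accumulates chunks.
class ChunkAssembler {
  constructor() {
    this.parts = [];
  }

  // Returns the full payload once all chunks have arrived, otherwise null.
  receiveChunk(index, total, text) {
    this.parts[index] = text;
    const received = this.parts.filter((p) => p !== undefined).length;
    if (received < total) return null;
    const payload = this.parts.join("");
    this.parts = []; // ready for the next payload
    return payload;
  }
}

// Messenger iframe side: reloaded for every message, so it keeps no state.
// It looks up its stateful sibling by name and hands the chunk over.
// In a browser, the stateful page would expose receiveChunk as a global function.
function forwardChunk(win, siblingName, index, total, text) {
  return win.parent.frames[siblingName].receiveChunk(index, total, text);
}
```

The messenger does nothing but forward, which keeps the page that reloads on every message as small as possible, while the sibling holds everything that must persist.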
Application of Ideas
The Windows Live Developer Platform team has developed these
ideas into a JavaScript "channel" library. These cross-domain channels
are used in the implementation of the Windows Live Contacts and Windows Live
Spaces Web controls (http://dev.live.com), which are intended to reside on
third-party Web pages but execute in a secure iframe in the live.com domain
context. The controls give third-party sites user-controlled access to the
user's Windows Live data, such as the contacts list or Spaces photo albums. The
channel object supports sending arbitrarily large data across iframe domain
boundaries, with receipt acknowledgement, message throttling, message chunking,
and sender identification all taking place under the hood.
Our goal is to groom this channel code into a reusable library,
available to internal Microsoft partners as well as third-party Web developers.
While the code is running well in its current contexts, we still have some work
to do in the area of self-diagnostics and troubleshooting. When you get the
channel endpoints configured correctly, it works great, but it can be a real
nightmare to figure out what isn't quite right when you're trying to get it set
up the first time. The main obstacle is the browser itself: trying
to see what's (not) happening in different domain contexts is a bit of a
challenge when the browser won't show you what's on the other side of the wall.
User Empowerment
Hardly 40 years ago, a shopper on Main Street USA had to go to
considerable effort to convince a shopkeeper to accept payment. If you didn't
have cash (and lots of it), you were most likely out of luck. If you had
foreign currency, you'd need to find a big bank in a big city to exchange for
local currency. Checks from out of town were rarely accepted, and store credit
was offered only to local residents.
Today, shoppers and shopkeepers share a global communications
system and global banking network that allows the shopper to bring their bank
services with them wherever they go, and helps the shopkeeper make sales they
otherwise might miss. The banking network also provides infrastructure support
to the shopkeeper, helping with currency conversion, shielding from credit risk
and reducing losses due to fraud.
Now, why can't the Internet provide similar protections and
empowerments for the wandering Web surfer, and infrastructure services for Web
site keepers? Bring your data and experience with you as you move from site to
site (the way a charge card brings your banking services with you as you shop),
releasing information to the site keepers only at your discretion. The Internet
is going to get there; it's just a matter of how well and how soon.
Acknowledgments
Kudos to Scott Isaacs for the original iframe URL data-passing
concept. Many thanks to Yaron Goland and Bill Zissimopoulos for their
considerable contributions to the early implementations and debugging of the
channel code, and to Gabriel Corverra and Koji Kato for their work in the more
recent iterations. "It's absolute insanity, but it just
might work!"
Resources
XMLHttpRequest 2, Ian Hickson
http://www.mail-archive.com/public-webapi@w3.org/msg00341.html
http://lists.w3.org/Archives/Public/public-webapi/2006Jun/0012.html
Anne van Kesteren's Weblog
http://annevankesteren.nl/2007/02/xxx
About the author
Danny Thorpe is a developer on the
Windows Live Developer Platform team. His badge says "Principal SDE,"
but he prefers "Windows Live Quantum Mechanic," as he spends much of
his time coaxing stubborn little bits to migrate across impenetrable barriers.
In past lives, he worked on "undisclosed browser technology" at
Google, and, before that, he was a Chief Scientist at Borland and Chief
Architect of the Delphi compiler. At Borland, he had the
good fortune to work under the mentorship of Anders Hejlsberg, Chuck
Jazdzewski, Eli Boling, and many other Borland legends. Prior to joining
Borland, he was too young to remember much. Check out his blog at
http://blogs.msdn.com/dthorpe.
This article was published in the Architecture Journal, a print
and online publication produced by Microsoft. For more articles from this
publication, please visit the Architecture Journal Web site.