
Shaping the future of secure Ajax mashups

How to improve the browser for hybrid Web applications


Level: Intermediate

Brent Ashley (brent@ashleyit.com), President, Ashley IT Services, Inc.

03 Apr 2007

Current Web browsers weren't designed to easily and securely get content from multiple sources into one page. Discover how developers have stretched the available tools to fit the task and how doing so has put strain on the resulting applications with respect to security and scalability. Also, learn about several browser improvements being proposed to remedy the situation and how to become part of the conversation that will bring Web development beyond this hurdle to a new level of interoperability.

Mashing it up with Ajax

A mashup is a Web application that integrates content from more than one source and delivers it for presentation in a single page. The server makes requests to each content source, parses the information it receives, and combines the results into a page to send to the browser, as Figure 1 shows.


Figure 1. Mashing up content from multiple sources
Mashup diagram
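
The combining step can be sketched as follows. This is only an illustration, assuming the server has already fetched each source's response body; the source names, the sample data, and the parseBody and combineSources helpers are hypothetical and do not come from any particular framework.

```javascript
// Hypothetical helper: turn one fetched response body into data.
function parseBody(format, body) {
  if (format === "json") {
    return JSON.parse(body);
  }
  if (format === "csv") {
    // Delimited text: one record per line, fields separated by commas.
    return body.trim().split("\n").map((line) => line.split(","));
  }
  throw new Error("Unsupported format: " + format);
}

// Hypothetical helper: merge all parsed sources into one object that a
// page template could render before the page is sent to the browser.
function combineSources(responses) {
  const page = {};
  for (const { name, format, body } of responses) {
    page[name] = parseBody(format, body);
  }
  return page;
}

// Made-up input standing in for two third-party responses.
const mashup = combineSources([
  { name: "weather", format: "json", body: '{"city":"Toronto","tempC":4}' },
  { name: "stops", format: "csv", body: "Union,43.645\nBloor,43.670\n" },
]);
```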

An Asynchronous JavaScript + XML (Ajax) application allows a Web page to get content from the server and update itself in place asynchronously using JavaScript™ code, as shown in Figure 2. In this way, users can interact with a rich user interface (UI) without reloading the full page. The server sends an initial page to the browser, which makes calls back to the server for updated content. These asynchronous calls frequently use XML to encode the data; however, other data formats are also common, such as JavaScript Object Notation (JSON), HTML, and delimited text.


Figure 2. Interactive data display with Ajax
Ajax diagram
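
A single asynchronous call of this kind might look like the following sketch. It uses the standard XMLHttpRequest methods (open, send, onreadystatechange); the requestUpdate name and the XhrCtor parameter are assumptions introduced here so that the function can also be exercised outside a browser.

```javascript
// Request `url` asynchronously and pass the response text to `onUpdate`.
// `XhrCtor` defaults to the browser's XMLHttpRequest; it is a parameter
// only so that this sketch can be tried with a stand-in object.
function requestUpdate(url, onUpdate, XhrCtor = XMLHttpRequest) {
  const xhr = new XhrCtor();
  xhr.open("GET", url, true); // third argument true = asynchronous
  xhr.onreadystatechange = function () {
    // readyState 4 means the response has been fully received.
    if (xhr.readyState === 4 && xhr.status === 200) {
      onUpdate(xhr.responseText);
    }
  };
  xhr.send(null);
  return xhr;
}
```

In a page, the callback would typically write the returned text into an existing element, which is how the page updates itself in place without a reload.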

An Ajax mashup is a hybrid Web application. It uses Ajax techniques to present a rich UI that updates itself in place using content that it retrieves asynchronously from multiple sources. The server sends an initial page to the browser, which then makes calls to retrieve updated content. These calls can be made directly to the third-party sources from the browser or back to the originating server, which acts as a proxy for the third-party content.





Round pegs, square holes

When the elements comprising the current browser environments were designed, Ajax mashups were not on anybody's radar. Nothing was built into the browsers, into the Hypertext Transfer Protocol (HTTP), or into HTML or XHTML that was specifically designed to accommodate the browser's asynchronous retrieval of content from multiple sources in a secure and robust manner. Some features in the World Wide Web Consortium (W3C) specifications that might have been used for mashups, such as the Document Object Model (DOM) Level 3 Load and Save Specification, were either not fully implemented or not implemented at all by a majority of browsers.

Dynamic HTML (DHTML) was not initially used in combination with dynamically retrieved content. Both the presentation and data elements of a dynamic Web page were delivered together along with scripts to manipulate them. The scripts would display, hide, move, create, and destroy document objects to create dynamic effects, but any time more data was needed from the server, the page would be replaced with a new one. Data flow was synchronous with page reload.

Consequently, developers who wished to build the kind of hybrid Web application that we now call a mashup had to take the available technology and find ways to stretch it to fit their needs. Two approaches were taken to allow the browser to retrieve content without reloading the page: embedding an external transport mechanism and using browser-native objects to perform transport duties.

Outside help

An early solution was Microsoft's Remote Scripting, which used a Java™ applet that exchanged XML-formatted messages with server-side components. This approach quickly became unwieldy because of vendor squabbles as well as Java Virtual Machine (JVM) and security model differences.

Microsoft later built the XMLHttpRequest (XHR) object, whose designers expected it to be used only with Microsoft® Outlook® Web Access (OWA). The object was initially available only to Windows® Internet Explorer® users and was not widely used until years later, when Mozilla and Safari adopted it. The object was originally an external Microsoft ActiveX® object; current implementations are native objects within the browser. Despite its name, the XHR object can transfer data in any format and is not limited to valid XML.

Many developers use the XML communication features of Macromedia Flash to build embeddable components to communicate with the server. The XMLSocket and flash.net.Socket objects provide abilities similar to XHR but with additional communication control and XML parsing capability.

An inside job

Because of the problems and dependencies associated with external transport mechanisms, the Internet development community has collaborated to discover and develop several browser-native remote calling methods.

  • Using a hidden iframe element to load external content: The iframe is then accessed through the DOM to extract the content from the document it has loaded. You can specify any parameters in the URL querystring or dynamically create a form that posts to the service with the iframe as a target. This method is compatible across a wide range of both current and older browsers.
  • Using an img element to send requests for content: The server performs its task using the parameters from the URL's querystring, and then returns encoded content in a cookie. This method is limited in the amount of data that can be easily communicated, because both querystring and cookies are limited in size.
  • Dynamically creating a script element in the DOM of the current page: Upon loading, code that the server supplies is immediately executed. The server uses parameters from the URL querystring.
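
The third technique in the list above can be sketched as follows. The buildServiceUrl and loadRemoteScript names are hypothetical, and the document object is passed in as a parameter (doc) only so that the sketch can be tried with a stand-in outside a browser.

```javascript
// Build the request URL: parameters travel in the querystring.
function buildServiceUrl(baseUrl, params) {
  const url = new URL(baseUrl);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

// Append a <script> element to the page. The browser fetches the src
// URL and runs whatever code the server returns as soon as it loads.
function loadRemoteScript(doc, baseUrl, params) {
  const script = doc.createElement("script");
  script.src = buildServiceUrl(baseUrl, params);
  doc.getElementsByTagName("head")[0].appendChild(script);
  return script;
}
```

A call such as loadRemoteScript(document, "https://api.example.com/search", { q: "ajax" }) would issue the request; the host name and parameter are placeholders. Because the returned code runs as-is, the page and the service must agree in advance on what that code does with the data, for example calling a function that the page has already defined.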

See Resources for links to detailed information on these tools and techniques.





Breaking the confines

Most of the techniques available for retrieving content asynchronously inherit their security from the JavaScript security model, which allows scripts to interact only with elements that share an origin (the same protocol, host, and port) with the page to which the script belongs. This is the Same Origin Policy, which all browsers implement.
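
As a rough illustration of the comparison involved, the following sketch treats two URLs as having the same origin when their protocol, host, and port all match. The sameOrigin function is a hypothetical helper written for this article; browsers perform the check internally and do not expose it under this name.

```javascript
// Compare the three parts that make up an origin. The URL class
// normalizes default ports (80 for http, 443 for https) to "".
function sameOrigin(urlA, urlB) {
  const a = new URL(urlA);
  const b = new URL(urlB);
  return (
    a.protocol === b.protocol &&
    a.hostname === b.hostname &&
    a.port === b.port
  );
}
```

Under this rule, two pages on http://www.example.com share an origin, while the same host reached over https, a sibling host such as api.example.com, or a different port does not.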

To have a Web page retrieve content from third-party sources, you must circumvent the Same Origin Policy. The commonly used exception that is not restricted by the Same Origin Policy is the <script> tag technique, whereby you append a <script> element to a page's DOM, causing it to load and run the code it finds at the URL that the element's src attribute specifies.

Using the <script> tag to run scripts from multiple sites presupposes a high level of trust between all sites involved, because all such scripts run in the same execution and security context and therefore can possibly gain access to information and cookies from the other sites.





Security or scalability: You can't have both

The workarounds currently in wide use to enable Ajax mashups each come at some cost. When stretching a browser's designed limits, you affect other aspects of the application's overall operation. Doing so typically causes an application to become either less secure or less scalable.

It's secure, but is it scalable?

When restricted by the browser's Same Origin Policy, the same server that hosts the application must take on the task of fetching the third-party content and sending it to the client. The server acts as a client to the third-party service in addition to its usual server function.

Using the server as a proxy for every client transaction means that a large number of users could cause undue server load. Applications using this technique would need to be designed to be scalable on the server side, using multiple coordinated servers to handle the request load.
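
The proxy arrangement can be sketched in two small pieces. The /proxy endpoint, the function names, and the allow-list of hosts are all assumptions made for illustration; the allow-list is one plausible safeguard, not something the technique requires.

```javascript
// Client side: wrap a third-party URL in a same-origin request.
// "/proxy" is a hypothetical endpoint on the application's own server.
function toProxyUrl(targetUrl) {
  return "/proxy?url=" + encodeURIComponent(targetUrl);
}

// Server side: recover the target URL and refuse hosts that are not on
// an allow-list, so the proxy cannot be used to reach arbitrary sites.
function resolveProxyTarget(proxyUrl, allowedHosts) {
  // The base URL is needed only to parse the relative proxy URL.
  const query = new URL(proxyUrl, "http://localhost").searchParams;
  const target = new URL(query.get("url"));
  if (!allowedHosts.includes(target.hostname)) {
    throw new Error("Host not allowed: " + target.hostname);
  }
  return target.toString();
}
```

The server would then fetch the resolved URL itself and relay the response to the browser. Every client request therefore costs the server one inbound and one outbound request, which is the source of the load described above.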

It's scalable, but is it vulnerable?

Use of the <script> tag to circumvent the Same Origin Policy allows the client to retrieve content from third parties. This functionality eliminates the server bottleneck to scalability, because each additional client takes on its own content-gathering role.

The scalability benefit of the <script> tag comes at the cost of sidestepping the Same Origin Policy security model, introducing potential attack vulnerabilities:

  • Cross-site cookie access becomes possible: A script loaded from a third-party site runs in the context of the hosting page and can therefore read that page's cookies.
  • There is no opportunity to inspect the retrieved code for safety issues before running it: The code runs immediately upon loading.




Potential solutions

Clearly, the tools that browsers currently provide for mashups are insufficient to allow you to build applications that are both scalable and secure. Developers must find solutions that work both now and in the long term.

Here and now

A more recently developed content-retrieval technique employs communication between a page's script and a hidden iframe through its src URL's fragment identifier (the part of the URL that comes after the # sign). Scripts in the parent page and embedded iframe can set each other's fragment identifiers despite coming from different origins. An agreed-upon communication protocol is maintained between the scripts, driven by JavaScript timers that periodically fire routines to check for changes in the fragment identifier.

Because the scripts must know each other's addresses and they must collaborate between themselves to agree on a protocol, trust is ensured. Because any server interaction is local to each component and separate from the inter-script communication, cookies are not exposed.

While still imperfect (for example, it relies on an anomaly that is not a designed behavior, and polling for changes is inferior to having an event fire in response to a change), this solution comes closer to providing browser-native, secure, in-page, cross-domain communication than any other.
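
The mechanics of the technique can be sketched as follows. This is not the Dojo Toolkit implementation; the function names are hypothetical, and the current URL is supplied through a getUrl callback so that the polling logic can be tried outside a browser.

```javascript
// Put a message into the fragment identifier of a URL.
function withFragment(url, message) {
  const base = url.split("#")[0];
  return base + "#" + encodeURIComponent(message);
}

// Read the message back out of a URL's fragment identifier.
function readFragment(url) {
  const index = url.indexOf("#");
  return index === -1 ? "" : decodeURIComponent(url.slice(index + 1));
}

// Build a polling routine. `getUrl` returns the current URL (in a
// browser, something like () => window.location.href) and `onMessage`
// receives each new fragment. A timer drives the returned function,
// for example setInterval(check, 100).
function makeFragmentWatcher(getUrl, onMessage) {
  let last = readFragment(getUrl());
  return function check() {
    const current = readFragment(getUrl());
    if (current !== last) {
      last = current;
      onMessage(current);
    }
  };
}
```

Because the watcher reacts only when the fragment changes, two identical messages sent in a row would go unnoticed; a working protocol would need something like a counter in each message to tell them apart.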

Note: James Burke, a developer at AOL Developer Network, pioneered the fragment identifier technique and has built it into the latest releases of the Dojo Toolkit JavaScript library.

Long term

Browser manufacturers and the development community are currently discussing several potential ways to modify elements of the browser environment to make it purpose-built for Ajax mashups. The Web Hypertext Application Technology Working Group (WHATWG) has a proposal in section 7.3 of its Web Applications 1.0 Working Draft for a mechanism called Cross Document Messaging. The Opera browser already implements this feature. It specifies a method of collaborative communication between DOM objects from different domains that allows the receiver of a message to choose which messages to respond to based on their origin.
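
The receiver's side of such an exchange might look like the sketch below. It assumes the event shape that was later standardized in HTML, where a message event carries origin and data properties; details in the draft discussed here may differ. The makeMessageHandler name and the origins are hypothetical.

```javascript
// Build a "message" event handler that ignores any sender whose origin
// is not on the receiver's list of trusted origins.
function makeMessageHandler(trustedOrigins, onTrustedMessage) {
  return function handleMessage(event) {
    if (!trustedOrigins.includes(event.origin)) {
      return false; // unknown sender: drop the message
    }
    onTrustedMessage(event.data, event.origin);
    return true;
  };
}
```

In a browser, the handler would be registered with window.addEventListener("message", handler), and the sending document would call postMessage on the receiving window.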

Ian Hickson (who was at Opera and is now at Google) has proposed cross-site extensions to the existing XMLHttpRequest object. His proposal consists of several modifications to the way requests are made, including restrictions on header control and an access-control mechanism.

Douglas Crockford, JavaScript evangelist and architect at Yahoo!, is one of the world's most knowledgeable experts on the JavaScript language. You can find many of his presentations and articles explaining advanced JavaScript techniques on his personal Web site and through the Yahoo! Developer Network. Another initiative that Crockford promotes is JSON, a data-interchange format that is widely used in Ajax applications, primarily because it is readily parsable by JavaScript and less verbose than XML. Crockford has written two proposals to build mashup-aware elements into browsers.

Outstanding proposals

Several outstanding proposals are available to help address this quandary:

  • JSONRequest proposal: Browsers implement a new object that acts much like the existing XMLHttpRequest object with several modifications:
    • JSONRequest would be exempt from the Same Origin Policy.
    • A minimal set of HTTP headers would be used, reducing the overall size of requests.
    • No cookies would be transferred, ensuring that cross-site cookie issues are avoided.
    • JSONRequest would accept only valid JSON text, which would ensure that raw executable code could not be sent for execution.
    • After a communication failure, random delays would be introduced before retry to frustrate certain classes of attacks.
    • Each request would return a sequence identifier, allowing asynchronous responses to be associated easily with their original requests.
    • Specific support for duplex connections would enable the server to asynchronously initiate communications through an open communications channel.
  • <module> tag proposal: A new HTML tag partitions a page into a collection of modules that are secure from each other but can communicate safely:
    • The <module> tag would be able to access third-party resources, exempt from Same Origin Policy.
    • Cooperative communication between page and module would be available only through specific interfaces. Modules would not be able to communicate with each other -- only with the page. A page can choose to facilitate communication between modules.
    • Communication would be restricted to valid JSON text, in contrast to communicating JavaScript objects, which could possibly cause security leakage through attached code.
    • Restrictions are proposed to ensure that modules and pages cannot interfere with one another's display, which could otherwise cause security issues.
  • Content restrictions header: Gervase Markham proposes a content restrictions header specification that would allow authors to express their full intent on how their content should interact with content from other sites. A compliant implementation would submit a content restrictions header containing a policy string.
  • W3C Access Control List (ACL) System: The W3C ACL System could be used as a model for an ACL-based system to govern access to HTTP-served resources in Ajax mashups.
  • crossdomain.xml: Flash objects look for a file called crossdomain.xml on the server before they attempt to access their specified URL. This file specifies which sites can host applications that access the services provided on that server. Many Web service providers already implement this file.
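
Three of the ideas in the JSONRequest proposal above (JSON-only responses, sequence identifiers, and random retry delays) can be illustrated with ordinary JavaScript. The sketch below is not the proposed JSONRequest API; every name in it is hypothetical, and the one-second base and doubling rule for the delay are assumptions chosen only to make the example concrete.

```javascript
// Accept only valid JSON text. JSON.parse never executes code, so a
// response such as "alert(1)" is rejected instead of being run.
function parseJsonOnly(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, value: null };
  }
}

// Hand out increasing sequence identifiers so that asynchronous
// responses can be matched to the requests that produced them.
function makeRequestTracker() {
  let nextId = 1;
  const pending = new Map();
  return {
    start(onDone) {
      const id = nextId++;
      pending.set(id, onDone);
      return id;
    },
    finish(id, text) {
      const onDone = pending.get(id);
      pending.delete(id);
      if (onDone) onDone(id, parseJsonOnly(text));
    },
  };
}

// Pick a random delay before a retry. The ceiling grows with the
// number of failures (assumed rule: 1 second, doubling each time).
function retryDelayMs(failures, random = Math.random) {
  const ceiling = 1000 * Math.pow(2, failures);
  return Math.floor(random() * ceiling);
}
```

Responses can arrive in any order; because each callback receives the identifier returned by start, the caller can still tell which request a given result belongs to.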

See Resources for links to detailed information on these proposals.





You can help shape the future

As developers, we all have a stake in the outcome of these discussions. By joining the conversation, you can help to design the most flexible yet secure improvements to the browser, improvements that will allow everyone to build robust and secure rich Web applications. I encourage you to seek out browser vendors and organizations advocating browser advances and join in:

  • Get involved with industry associations and working groups.
  • Interact with browser and tool vendors in newsgroups and forums.
  • Seek out and engage key players in the industry.

You can find links to starting points in the Resources for this article.



Resources

Learn

Get products and technologies
  • IBM trial software: Build your next development project with trial software available for download directly from developerWorks.


Discuss


About the author


Brent Ashley is a consultant and scripting specialist in the Toronto area. Involved in computers and technology since 1979, he has been active in developing and promoting rich Web application development techniques such as Ajax and remote scripting since 1999. Brent talks about technical subjects on his blog. You can reach Brent at brent@ashleyit.com.









Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc., in the United States, other countries, or both. Microsoft, ActiveX, Internet Explorer, Outlook, and Windows are trademarks of Microsoft Corporation in the United States, other countries, or both. Other company, product, or service names may be trademarks or service marks of others.