Showing posts with label Asynchronous. Show all posts

Monday, September 9, 2013

2013-09-09: MS Thesis: HTTP Mailbox - Asynchronous RESTful Communication

It is my pleasure to report the successful completion of my Master's degree thesis entitled "HTTP Mailbox - Asynchronous RESTful Communication". I defended my thesis on July 11, 2013, and my written thesis was accepted on August 23, 2013. In this blog post I will briefly describe the problem that the thesis targets, followed by the proposed and implemented solution. I will then walk through an example that illustrates the usage of the HTTP Mailbox and provide various links and resources to further explore it.

Traditionally, general web services used only the GET and POST methods of HTTP while several other HTTP methods like PUT, PATCH, and DELETE were rarely utilized. Additionally, the Web was mainly navigated by humans using web browsers and clicking on hyperlinks or submitting HTML forms. Clicking on a link is always a GET request while HTML forms only allow GET and POST methods. Recently, several web frameworks/libraries have started supporting RESTful web services through APIs. To support HTTP methods other than GET and POST in browsers, these frameworks have used hidden HTML form fields as a workaround to convey the desired HTTP method to the server application. In such cases, the web server is unaware of the intended HTTP method because it receives the request as POST. Middleware between the web server and the application may override the HTTP method based on special hidden form field values. Unavailability of the servers is another factor that affects the communication. Because of the stateless and synchronous nature of HTTP, a client must wait for the server to be available to perform the task and respond to the request. Browser-based communication also suffers from cross-origin restrictions for security reasons.
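
The hidden-field workaround described above can be sketched as a small piece of middleware. The following is a minimal Python WSGI sketch, not the code of any particular framework: the field name `_method`, the class name `MethodOverrideMiddleware`, and the set of allowed overrides are assumptions chosen for illustration.

```python
import io
from urllib.parse import parse_qs

# Assumption: only these methods may be requested through the hidden field.
ALLOWED_OVERRIDES = {"PUT", "PATCH", "DELETE"}


class MethodOverrideMiddleware:
    """Rewrite REQUEST_METHOD for form POSTs that carry a hidden override field."""

    def __init__(self, app, field="_method"):
        self.app = app
        self.field = field  # hypothetical field name, configurable

    def __call__(self, environ, start_response):
        content_type = environ.get("CONTENT_TYPE", "")
        is_form_post = (
            environ.get("REQUEST_METHOD") == "POST"
            and content_type.startswith("application/x-www-form-urlencoded")
        )
        if is_form_post:
            length = int(environ.get("CONTENT_LENGTH") or 0)
            body = environ["wsgi.input"].read(length)
            # Put the body back so the wrapped application can still read it.
            environ["wsgi.input"] = io.BytesIO(body)
            fields = parse_qs(body.decode("utf-8", errors="replace"))
            override = fields.get(self.field, [""])[0].upper()
            if override in ALLOWED_OVERRIDES:
                environ["REQUEST_METHOD"] = override
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Tiny application that echoes the method it believes it received."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["REQUEST_METHOD"].encode("ascii")]


app = MethodOverrideMiddleware(demo_app)
```

Note that the web server in front of such middleware still logs and treats the request as a POST; only the application behind the middleware sees the overridden method.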

We describe HTTP Mailbox, a mechanism to enable RESTful HTTP communication in an asynchronous mode with a full range of HTTP methods otherwise unavailable to standard clients and servers. HTTP Mailbox also allows for multicast semantics via HTTP. We evaluate a reference implementation using ApacheBench (a server stress testing tool) demonstrating high throughput (on 1,000 concurrent requests) and a systemic error rate of 0.01%. Finally, we demonstrate our HTTP Mailbox implementation in a human-assisted Web preservation application called "Preserve Me!" and a visualization application called "Preserve Me! Viz".

The HTTP Mailbox is inspired by the pre-Web distributed computing model Linda and the modern Web-scale distributed computing architecture REST. It tunnels HTTP traffic over HTTP using the message/http (or application/http) MIME type and stores the HTTP messages (requests/responses) along with some extra metadata for later retrieval. The HTTP Mailbox provides a RESTful API to send and retrieve asynchronous HTTP messages. For a quick walk-through of the thesis please refer to the oral presentation slides (HTML) or access them on SlideShare. A complete copy of the thesis (PDF) is also publicly available at:
Sawood Alam, HTTP Mailbox - Asynchronous RESTful Communication, MS Thesis, Computer Science Department, Old Dominion University, August 2013.


Our preliminary implementation code can be found on GitHub. We have also deployed an instance of our implementation on Heroku for public use. This instance internally uses the Fluidinfo service for message storage. Let us have a look at the deployed service to illustrate its usage.

Let us assume that we want to check the HTTP Mailbox to see if there are any messages for http://example.com/all. Our HTTP Mailbox API endpoint is located at http://httpmailbox.herokuapp.com/hm/. Hence we will make a GET request as illustrated below.

$ curl -i http://httpmailbox.herokuapp.com/hm/http://example.com/all
HTTP/1.1 404 Not Found
Content-Type: message/http
Date: Mon, 09 Sep 2013 16:59:13 GMT
Server: HTTP Mailbox
Content-Length: 0
Connection: keep-alive

This indicates that there are no messages for the given URI. Now let us POST something to that URI first. We have an example file named "welcome.txt" that is a valid HTTP message which we want to send to http://example.com/all.

$ cat welcome.txt
POST /all HTTP/1.1
Host: example.com
Content-Type: text/plain
Content-Length: 32

Welcome to the HTTP Mailbox! :-)

Now let us POST this message to the given URI.

$ curl -i -X POST --data-binary @welcome.txt \
> -H "Sender: hm-deployer" \
> -H "Content-Type: message/http" \
> http://httpmailbox.herokuapp.com/hm/http://example.com/all
HTTP/1.1 201 Created
Content-Type: message/http
Date: Mon, 09 Sep 2013 17:13:02 GMT
Location: http://httpmailbox.herokuapp.com/hm/id/ab3defce-dfa9-4d09-a72d-cac267531ca6
Server: HTTP Mailbox
Content-Length: 0
Connection: keep-alive
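
For readers who prefer a script to curl, the same POST can be assembled with Python's standard library. This is only a sketch: the function name `build_send_request` is hypothetical, the message bytes are typed in by hand rather than read from welcome.txt, and the request is built but deliberately not sent.

```python
import urllib.request

# Endpoint of the deployed instance as given in this post.
MAILBOX_ENDPOINT = "http://httpmailbox.herokuapp.com/hm/"


def build_send_request(recipient_uri, http_message, sender):
    """Build (without sending) a POST that drops http_message into the mailbox."""
    return urllib.request.Request(
        MAILBOX_ENDPOINT + recipient_uri,
        data=http_message,
        method="POST",
        headers={"Sender": sender, "Content-Type": "message/http"},
    )


# The contents of welcome.txt, assuming Unix line endings.
message = (
    b"POST /all HTTP/1.1\n"
    b"Host: example.com\n"
    b"Content-Type: text/plain\n"
    b"Content-Length: 32\n"
    b"\n"
    b"Welcome to the HTTP Mailbox! :-)"
)

req = build_send_request("http://example.com/all", message, "hm-deployer")
print(req.get_method(), req.full_url)
# Passing req to urllib.request.urlopen() would actually send it.
```

With Unix line endings the message is 114 bytes long, which matches the Content-Length the mailbox reports when the message is retrieved.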

Now that we have POSTed the message, we can retrieve it anytime later.

$ curl -i http://httpmailbox.herokuapp.com/hm/http://example.com/all
HTTP/1.1 200 OK
Content-Type: message/http
Date: Mon, 09 Sep 2013 17:15:33 GMT
Link: <http://httpmailbox.herokuapp.com/hm/http://example.com/all>; rel="current",
 <http://httpmailbox.herokuapp.com/hm/id/ab3defce-dfa9-4d09-a72d-cac267531ca6>; rel="self",
 <http://httpmailbox.herokuapp.com/hm/id/ab3defce-dfa9-4d09-a72d-cac267531ca6>; rel="first",
 <http://httpmailbox.herokuapp.com/hm/id/ab3defce-dfa9-4d09-a72d-cac267531ca6>; rel="last"
Memento-Datetime: Mon, 09 Sep 2013 17:13:01 GMT
Server: HTTP Mailbox
Via: sent by 128.82.4.75 on behalf of hm-deployer, delivered by http://httpmailbox.herokuapp.com/hm/
Content-Length: 114
Connection: keep-alive

POST /all HTTP/1.1
Host: example.com
Content-Type: text/plain
Content-Length: 32

Welcome to the HTTP Mailbox! :-)

So far, there is only one message for the given URI. If more messages are posted to the same URI, the retrieval request above will retrieve only the last message of the chain. From there, the "Link" header can be used to navigate through the message chain.
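
To navigate the chain programmatically, a client first has to split the "Link" header into relation names and URIs. The sketch below handles only the `<URI>; rel="..."` form that appears in the response above; it is not a complete Web Linking parser, and the names `parse_link_header` and `LINK_VALUE` are made up for this example.

```python
import re

# Matches one link-value of the form <URI>; rel="relation names"
LINK_VALUE = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def parse_link_header(value):
    """Return a dict mapping each relation name to its target URI."""
    links = {}
    for uri, relations in LINK_VALUE.findall(value):
        # A single rel attribute may carry several space-separated names.
        for relation in relations.split():
            links[relation] = uri
    return links


# The Link header from the retrieval response above, joined into one line.
header = (
    '<http://httpmailbox.herokuapp.com/hm/http://example.com/all>; rel="current", '
    '<http://httpmailbox.herokuapp.com/hm/id/ab3defce-dfa9-4d09-a72d-cac267531ca6>; rel="self", '
    '<http://httpmailbox.herokuapp.com/hm/id/ab3defce-dfa9-4d09-a72d-cac267531ca6>; rel="first", '
    '<http://httpmailbox.herokuapp.com/hm/id/ab3defce-dfa9-4d09-a72d-cac267531ca6>; rel="last"'
)

links = parse_link_header(header)
print(links["first"])
```

Because there is only one message so far, "self", "first", and "last" all point to the same message URI.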

We have been using the HTTP Mailbox service in various applications including "Preserve Me!" and "Preserve Me! Viz". The following screenshot illustrates its usage in "Preserve Me!".


We would like to thank GitHub for hosting our code, Heroku for running our HTTP Mailbox instance on their cloud infrastructure, and Fluidinfo for storing messages in their "tag and value" style RESTful storage system.

I am grateful to my advisor Dr. Michael L. Nelson, committee members Dr. Michele C. Weigle and Dr. Ravi Mukkamala, colleagues, and everyone else who helped me in the process of getting my Master's degree. I am now continuing my research under the guidance of Dr. Michael L. Nelson at Old Dominion University.

Resources

--
Sawood Alam

Monday, May 13, 2013

2013-05-09: HTTP Mailbox - Asynchronous RESTful Communication

We often encounter web services that take a very long time to respond to our HTTP requests. In the case of an eventual network failure, we are forced to issue the same HTTP request again. We frequently consume web services that do not support REST. If they did, we could utilize the full range of HTTP methods while retaining the functionality of our application, even when the external API we utilize in our application changes. We sometimes wish to set up a web service that takes job requests, processes long-running job queues, and notifies the clients individually or in groups. HTTP does not allow multicast or broadcast messaging. HTTP also requires the client to stay connected to the server while the request is being processed.

Introducing HTTP Mailbox - An Asynchronous RESTful HTTP Communication System. In a nutshell, HTTP Mailbox is a mailbox for HTTP messages. Using its RESTful API, anyone can send an HTTP message (request or response) to anyone else independent of the availability, or even the identity of recipient(s). The HTTP Mailbox stores these messages and delivers them on demand. Each HTTP message is encapsulated in the body of another HTTP message and sent to the HTTP Mailbox using a POST method. Similarly, the HTTP Mailbox encapsulates the HTTP message in the body of its response when a GET request is made to retrieve the messages.

Tunneling HTTP traffic over HTTP was also explored in Relay HTTP. However, Relay HTTP relays live HTTP traffic back and forth and does not store HTTP messages. It works like a proxy server solely to overcome JavaScript's cross-origin restriction in Ajax requests. Relay HTTP still requires the client and the server, along with the relay server, to meet in time.

The store-and-forward nature of the HTTP Mailbox is inspired by Linda. We have taken the simplicity of the Linda model and implemented it using HTTP at the scale of the Web. This approach has enabled asynchronous, indirect, time-uncoupled, space-uncoupled, individual, and group communication over HTTP. Time-uncoupling means that the sender and recipient(s) need not meet in time to communicate, while space-uncoupling means that they need not know each other's identity. The HTTP Mailbox also enables utilization of the full range of HTTP methods otherwise unavailable to standard clients and servers.
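
The store-and-forward idea can be illustrated with a toy in-memory model. This is not the reference implementation (which speaks HTTP and persists messages); the class `ToyMailbox`, its method names, and the integer message ids are all assumptions made to keep the sketch short.

```python
import itertools


class ToyMailbox:
    """In-memory stand-in for a mailbox: stores messages per recipient URI."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._messages = {}  # message id -> (sender, message bytes)
        self._chains = {}    # recipient URI -> message ids, oldest first

    def send(self, recipient_uri, sender, message):
        """Store a message and return its id; the recipient need not be online."""
        message_id = next(self._next_id)
        self._messages[message_id] = (sender, message)
        self._chains.setdefault(recipient_uri, []).append(message_id)
        return message_id

    def latest(self, recipient_uri):
        """Return the newest message for a recipient, or None if there is none."""
        chain = self._chains.get(recipient_uri, [])
        if not chain:
            return None
        sender, message = self._messages[chain[-1]]
        return {
            "id": chain[-1],
            "sender": sender,
            "message": message,
            "previous": chain[-2] if len(chain) > 1 else None,
        }


mailbox = ToyMailbox()
# Two senders post to a shared URI without knowing who will read it
# (space-uncoupled).
first = mailbox.send("http://example.com/all", "alice", b"first message")
second = mailbox.send("http://example.com/all", "bob", b"second message")
# A recipient retrieves whenever it becomes available (time-uncoupled).
newest = mailbox.latest("http://example.com/all")
print(newest["sender"], newest["previous"])
```

Any number of recipients can read the same URI, which is how a single stored message can serve group communication.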


The above figure shows the lifecycle of a typical HTTP message using the HTTP Mailbox in four steps. We will walk through this example to explain how it works. Assume that the client wants to send the following HTTP message to the server at example.com/tasks/1.

> PATCH /tasks/1 HTTP/1.1
> Host: example.com
> Content-Type: text/task-patch
> Content-Length: 11
> 
> Status=Done

Step 1: assuming that the HTTP Mailbox server is running on example.net, the message will be encapsulated in a POST request like this:

> POST /hm/http://example.com/tasks HTTP/1.1
> Host: example.net
> HM-Sender: http://example.org/alice
> Content-Type: message/http; msgtype: request
> Content-Length: 108
> 
> PATCH /tasks/1 HTTP/1.1
> Host: example.com
> Content-Type: text/task-patch
> Content-Length: 11
> 
> Status=Done

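The encapsulation in Step 1 amounts to prepending a new request head to the untouched bytes of the original message. A minimal sketch follows, assuming CRLF line endings; the function name `encapsulate` is hypothetical, and the header values are copied from the example above exactly as written.

```python
def encapsulate(inner_message, mailbox_host, recipient_uri, sender):
    """Wrap an HTTP message in the body of a POST addressed to the mailbox."""
    head = (
        f"POST /hm/{recipient_uri} HTTP/1.1\r\n"
        f"Host: {mailbox_host}\r\n"
        f"HM-Sender: {sender}\r\n"
        # Content-Type value reproduced from the example in this post.
        "Content-Type: message/http; msgtype: request\r\n"
        # The outer Content-Length counts every byte of the inner message.
        f"Content-Length: {len(inner_message)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + inner_message


inner = (
    b"PATCH /tasks/1 HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: text/task-patch\r\n"
    b"Content-Length: 11\r\n"
    b"\r\n"
    b"Status=Done"
)

outer = encapsulate(
    inner, "example.net", "http://example.com/tasks", "http://example.org/alice"
)
print(outer.decode("ascii"))
```

With CRLF line endings the inner message is 108 bytes, which is where the outer Content-Length of 108 comes from, while the inner Content-Length of 11 covers only "Status=Done".
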
Step 2, the HTTP Mailbox will store the message and return the URI of the newly created message in the response as follows:

< HTTP/1.1 201 Created
< Location: http://example.net/hm/id/5ecb44e0
< Date: Thu, 20 Dec 2012 02:22:56 GMT
 
Step 3, example.com makes an HTTP GET request to the HTTP Mailbox server to retrieve its messages. To retrieve the most recent message sent to "http://example.com/tasks" a request will look like this:

> GET /hm/http://example.com/tasks HTTP/1.1
> Host: example.net

Step 4, the response from the HTTP Mailbox will contain the most recent message sent to "http://example.com/tasks". The response will also include a "Link" header that gives the URLs to navigate through the chain of messages for that recipient.

< HTTP/1.1 200 OK
< Date: Thu, 20 Dec 2012 02:10:22 GMT
< Link: <http://example.net/hm/id/aebed6e9>; rel="first",
<  <http://example.net/hm/id/5ecb44e0>; rel="last self",
<  <http://example.net/hm/id/85addc19>; rel="previous",
<  <http://example.net/hm/http://example.com/tasks>; rel="current"
< Via: Sent by 127.0.0.1
<  on behalf of http://example.org/alice
<  delivered by http://example.net/
< Content-Type: message/http; msgtype: request
< Content-Length: 108
< 
< PATCH /tasks/1 HTTP/1.1
< Host: example.com
< Content-Type: text/task-patch
< Content-Length: 11
< 
< Status=Done

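Once the recipient has the response body from Step 4, it holds the original request as raw bytes and has to unpack it before acting on it. The following is a simplified sketch of that unpacking, assuming CRLF line endings and no folded or repeated headers; `parse_http_request` is a hypothetical helper, not part of the HTTP Mailbox API.

```python
def parse_http_request(raw):
    """Split a raw HTTP request into request line parts, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        # A repeated header would overwrite the earlier value in this sketch.
        headers[name.strip().lower()] = value.strip()
    return {
        "method": method,
        "target": target,
        "version": version,
        "headers": headers,
        "body": body,
    }


# The body of the mailbox response from Step 4.
retrieved = (
    b"PATCH /tasks/1 HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: text/task-patch\r\n"
    b"Content-Length: 11\r\n"
    b"\r\n"
    b"Status=Done"
)

request = parse_http_request(retrieved)
print(request["method"], request["target"], request["body"])
```

The recipient can then apply the PATCH locally as if the client had sent it directly, even though only a GET ever reached the mailbox on its behalf.
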
A tech report describing the HTTP Mailbox in detail is published on arXiv. A reference implementation of the HTTP Mailbox can be found on GitHub.

We have already used the HTTP Mailbox successfully in the following applications.
  • Preserve Me! - a distributed web object preservation system that establishes a social network among web objects and uses the HTTP Mailbox for its communication needs.
  • Preserve Me! Viz - a dynamic and interactive network graph visualization tool that gives insight into the Preserve Me! graph and communication.


Where else can we use the HTTP Mailbox?
  • Warrick - a tool to restore lost websites. It can use the HTTP Mailbox to accept service requests and send status notifications.
  • Carbon Dating the Web - a tool to find out the age of a resource at a given URL. This process usually takes a few minutes to complete each request in the queue. It can utilize the HTTP Mailbox to accept service requests and send the response when ready.
  • Device notifications - related to software updates, general application messaging.
  • Any application that needs asynchronous RESTful HTTP messaging.

Resources

--
Sawood Alam