As an educational tool, this article will describe how to re-engineer a familiar application, email, as a Web Service using HTTP and the principles of Web Architecture and REpresentational State Transfer.
1 Basic Concepts
2 Sending a Message Directly
3 Sending the Message
4 Adding a Queuing Mailbox
5 Managing Mailboxes
6 Delivering Mail to the End-User
7 Summary of REST Benefits
8 Prior Art
Caveat #1: This article is not intended as a proposal for a replacement for email. If REST becomes a dominant methodology for networked applications, then it might be a suitable basis for a replacement for SMTP someday. But not today. Nevertheless, in understanding how the move to REST improves email, you will hopefully come to see how moving your own web services to it would improve them. In particular, these ideas may be of value to someone interested in designing asynchronous systems with HTTP.
Caveat #2: I am using Roy Fielding's name for the Web's architectural style and I am trying to stick to that style as closely as possible. But that does not imply that he necessarily thinks that replacing SMTP with HTTP would be a good idea.
Caveat #3: These ideas are still rough. These are just a few thoughts on my part!
Anything addressable by a URI is called a resource. We get representations of information from resources, and send representations of information to them. To get a representation from a resource we use the HTTP method GET. To overwrite a resource based on a representation we use the PUT method. To delete a resource we use the DELETE method. To modify a resource based upon its current state (extend it or mutate it) we use POST.
Note that URIs are for objects (resources), not for actions. Putting any kind of method name in a URI is typically considered poor style in REST. HTTP already has four actions: GET, PUT, POST and DELETE. Another standard called WebDAV adds others like LOCK and MOVE.
Some resources are collections. They will often be represented by XML documents containing URI references to the children. When viewed in a browser, they should usually be rendered as HTML lists of links. POSTing to collections creates new URIs that are considered children of the original resource. It does not matter whether the child's URI is similar to the parent. It could live on a totally different machine for all web infrastructure cares. The parent-child relationship should be represented in the XML. The structure of the URI is irrelevant to client software. This is called "URI opacity."
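The collection semantics described above can be sketched in a few lines. This is a minimal in-memory model, not a real HTTP server; the class and method names are my own invention. Note how the child URI is simply whatever the collection hands back, which is all a client is entitled to rely on under URI opacity.

```python
import itertools

class Collection:
    """In-memory sketch of a REST collection resource.

    POSTing a representation creates a child resource with its own URI.
    The child URI need not resemble the parent's (URI opacity); appending
    a generated id here is a server-side convenience, not a contract.
    """

    def __init__(self, base_uri):
        self.base_uri = base_uri
        self.children = {}            # child URI -> representation
        self._ids = itertools.count(1)

    def post(self, representation):
        """Create a child resource; return its URI (the Location header)."""
        child_uri = f"{self.base_uri}/{next(self._ids)}"
        self.children[child_uri] = representation
        return child_uri

    def get(self):
        """An XML listing of child URIs, renderable as a list of links."""
        links = "".join(f'<link href="{uri}"/>' for uri in self.children)
        return f"<collection>{links}</collection>"

inbox = Collection("http://www.example.com/incoming_messages")
uri = inbox.post("Hello World")
```

A client that GETs the collection sees only opaque child URIs; it learns the parent-child relationship from the XML, not from parsing the URI strings.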
Message recipients would be represented not by email addresses but by HTTP URIs. The recipients are called "recipient mailboxes" and any individual may have as many of these "recipient mailboxes" as he or she wants. In the simplest case a piece of client software directly sends a message to a recipient mailbox.
We'll start with this case and deal with additional complexity later.
An outgoing message has two parts. One is the actual content to be sent. It may be any kind of resource. The second is a notification that the content is available. An extension of the pre-existing RSS format is probably appropriate for the notification. There are reasons we split the notification from the content. That should become clear as we progress.
For instance if you want to email the words "Hello World", your email client would first make a web resource with those words and then make a notification about it:
Sender -> Sender's ISP
POST http://www.myisp.com/my_outgoing_message_content
Content-Type: text/plain

Hello World
Sender <- Sender's ISP
202 Accepted
Location: http://www.myisp.com/my_outgoing_message_content/3432

A new Web resource has been created.
Sender -> Recipient Mailbox
POST http://www.recipient.com/incoming_messages
Content-Type: text/xml+rss

<notification>...
  <item>
    <description>Test message from Paul</description>
    <url>http://www.myisp.com/my_outgoing_message_content/3432</url>
  </item>
</notification>
Sender <- Recipient Mailbox
202 Accepted
Location: http://www.recipient.com/incoming_messages/78979

Your notification will be stored until it is retrieved by the recipient.
Other metadata may be added to the notification resource as appropriate.
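The notification in the exchange above can be generated mechanically. Here is a sketch using Python's standard XML library; the element vocabulary (`notification`, `item`, `description`, `url`) is taken from the example, but the exact format is an assumption, since the article only proposes "an extension of the pre-existing RSS format".

```python
import xml.etree.ElementTree as ET

def make_notification(description, content_url):
    """Build an RSS-style notification pointing at separately-hosted
    message content (element names follow the example exchange)."""
    notification = ET.Element("notification")
    item = ET.SubElement(notification, "item")
    ET.SubElement(item, "description").text = description
    ET.SubElement(item, "url").text = content_url
    return ET.tostring(notification, encoding="unicode")

body = make_notification(
    "Test message from Paul",
    "http://www.myisp.com/my_outgoing_message_content/3432",
)
```

The `body` string is what the sender POSTs to the recipient mailbox; the content itself travels separately, which is the point of the split.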
We could make this a little bit more efficient by inlining the content into the notification message. We will not go into that in more detail but think of it as an optimization of a common special case.
The problem with directly delivering the message from a client mailbox to a recipient's inbox is that the recipient machine may not be online at the same time you are. Most end-users will prefer to delegate delivery to an intermediary managed by their ISP as they do today.
Once the recipient mailbox receives the notification, it would then turn around and do a GET on the message content to retrieve it (perhaps a future HTTP version would allow the reuse of the TCP connection). Of course if the content were inlined it would not have to do this.
There are a variety of reasons to push only a notification rather than a full message. One reason is that the recipient mailbox could decide based on metadata that it doesn't want the full message. Perhaps it is too large. Nevertheless, it could alert the recipient's mail user agent and that program could download the message directly when it really needs it. Also, if hundreds of people at a particular corporation are getting the same message, their server could choose to download it only once.
Also, the reliability of this solution is better. The POST is small and cheap, so there is little cost to redundantly sending it multiple times in case network problems interrupt it. Once the recipient mailbox knows about the message it will do a GET to fetch the body. This GET is safe and side-effect-free and therefore can also be repeated with impunity. If we used a single POST to send the data then there would be a danger of the same information being sent twice and the recipient might think that there were two different messages (though this could be avoided with intelligent use of message IDs). Another nice feature is that even if the data never gets to the mailbox, the recipient will at least know that there was an attempt to send a message, which is probably better than not knowing at all.
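The safe-to-repeat property of GET is what makes a dumb retry loop correct here. A sketch, with a stand-in `get` callable in place of a real HTTP client (the function names and the fixed retry count are mine):

```python
import time

def fetch_with_retry(get, url, attempts=3, delay=0.0):
    """Retry a safe, side-effect-free GET until it succeeds.

    Because GET is idempotent, repeating it cannot create duplicate
    messages, unlike a naive repeated POST of the full message body.
    `get` is any callable url -> body that raises IOError on failure.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return get(url)
        except IOError as exc:
            last_error = exc
            time.sleep(delay)   # a real client would back off exponentially
    raise last_error

# A flaky transport that fails twice, then delivers the body:
calls = {"n": 0}
def flaky_get(url):
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("connection reset")
    return "Hello World"

body = fetch_with_retry(flaky_get, "http://www.myisp.com/msg/3432")
```

The same loop would be wrong wrapped around a POST of the message body, which is exactly the duplicate-delivery danger the article describes.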
In most cases recipients will want to do a GET immediately after getting a notification if the message is small and was not inlined. Even though the information will be on an ISP's box, it could get deleted if the user ran out of disk space or if it simply timed out waiting to be picked up.
In order to ensure that the resource only gets GETs from legitimate recipients, it must either have a cryptographically unguessable URI or have an unguessable signature associated with it. For familiarity I will call the signature a "message password". Unguessable URIs are essentially a form of security mechanism known as a "capability". Capability security is simpler (just one URI rather than a URI/password pair) but it requires software and end-users to adhere to the discipline that sharing a URI is equivalent to sharing the whole message. It may be better to use the more familiar (if more complicated) message password metaphor. So the original notification POST should have a message password in it and the GET is only acknowledged if it has the right password.
If the recipient mailbox for some reason needs to report a problem with delivery or something else, it can POST that as status metadata to the message URI in order to make it known. When it has successfully received the message, it should POST that fact as well.
There is no requirement for the recipient mailbox to GET the information when it first gets the notification. It could delay the GET for disk space reasons. But even more interesting, it could delay it for "liveness" reasons. For instance what if there were a header that you could send along with an email notification that said: "this is mutable, changeable data. Download it at the last possible second." Then every time the recipient looked at the message they would see the very latest copy of the information. You could send an email with a stock price in it, and the stock price would be accurate at the time the reader read the email!
This raises the question: what exactly is the difference between an email and a web page? What if you pushed notifications of web page changes to a person's inbox? For instance if you did this for a weblog, then all of a sudden your weblog would appear in their inbox! Subscribing to a weblog would be just like subscribing to a mailing list and would use all of the same protocols underneath. You can see how REST blurs the boundaries between applications and could obliterate the boundary between "the web" and email.
A notification queuing intermediary is a web resource that we will call an "outgoing mailbox." Standard authentication should be used to ensure that only authorized users can add information to the outgoing mailbox (you do not want spammers wasting your resources). Mail can be added to the outgoing mailbox by POSTing a notification with information about who it should be relayed to.
Sender -> Outgoing Mailbox
POST http://www.myisp.com/outgoing_notifications
Content-Type: text/xml+rss

<notification>...
  <to>http://www.recipient.com/recipient1</to>
  <to>http://www.recipient2.com/recipient2</to>
  <item>
    <description>Test message from Paul</description>
    <url>http://www.myisp.com/my_outgoing_message_content/msg_3432</url>
  </item>
</notification>
When a notification message is POSTed to the outgoing mailbox, it is given a unique URI and this is returned to the mail user agent.
Sender <- Outgoing Mailbox
202 Accepted
Location: http://www.myisp.com/outgoing_notifications/not_2321
The user agent may use this URI to allow the user to check on the progress of the message later.
Sender -> Outgoing Mailbox
GET http://www.myisp.com/outgoing_notifications/not_2321
Sender <- Outgoing Mailbox
200 OK

<notification> ...
  <recipient status="success" href="http://www.recipient.com/recipient1"/>
  <recipient status="retry" href="http://www.recipient2.com/recipient2"/>
  ...
</notification>
Progress information is part of the metadata managed by the outgoing mailbox. As far as I know, there is no way to request status updates for messages in the current mail system, but this falls naturally out of the REST design: a GET (or perhaps even a HEAD) request on the URI would return useful metadata about the message, and progress information is useful metadata.
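A user agent checking on progress would simply parse the per-recipient status attributes out of the response above. A sketch with Python's standard XML library (the attribute names follow the example exchange; the function name is mine):

```python
import xml.etree.ElementTree as ET

def delivery_status(notification_xml):
    """Summarize per-recipient delivery progress from the metadata
    returned by a GET on the notification URI."""
    root = ET.fromstring(notification_xml)
    return {r.get("href"): r.get("status") for r in root.iter("recipient")}

status = delivery_status(
    '<notification>'
    '<recipient status="success" href="http://www.recipient.com/recipient1"/>'
    '<recipient status="retry" href="http://www.recipient2.com/recipient2"/>'
    '</notification>'
)
```

Because the status lives behind a plain GET, any HTTP client at all can ask "did my message arrive?" with no mail-specific protocol machinery.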
If the user switches from one user agent to another, the new one may be apprised of the status of all notifications by visiting the "outgoing_notifications" container. It would merely be a list of URIs:
User Agent -> Outgoing Mailbox
GET http://www.myisp.com/outgoing_notifications
User Agent <- Outgoing Mailbox
<notifications>
  <notification status="sent" href="http://www.recipient.com/recipient1"/>
  <notification status="waiting" href="http://www.recipient2.com/recipient2"/>
  ...
</notifications>
The ability to bring new agents into a system is, in my opinion, a key feature of REST designs.
If the recipient mailbox is not reachable for some reason, the sender could use standard retry and quit algorithms. If a notification send fails, that failure should not be represented at the protocol level as an email back to the user. A mail user interface could represent it that way but that would be a user interface choice. Rather a message send failure should be represented as metadata on the message itself. The user's client software can do a query on the mailbox to find out how many messages are in a failure state and represent this in an appropriate fashion.
There are various potential business rules you could use for cleaning up. One might be that notifications are removed from the system as soon as they are successfully delivered or are deleted by the sender, whichever comes first. Deletion can be done using the standard HTTP "DELETE" method and in fact any WebDAV client can be used to make this natural even in the absence of a real user agent.
Rather than having a separate namespace for email addresses, incoming mailboxes would have HTTP URIs. It should be easy for a program to navigate from a web page to an associated mailbox (typically from a user's home page to their standard inbox).
<html><head><link rel="mailbox" href="mymailbox"/></head>...</html>
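A client could discover the mailbox from a home page like the one above using an ordinary HTML parser. A sketch with Python's standard `html.parser`; the `rel="mailbox"` convention is the article's, while the class name is mine:

```python
from html.parser import HTMLParser

class MailboxFinder(HTMLParser):
    """Extract the href of <link rel="mailbox" .../> from a home page,
    navigating from a user's page to their standard inbox."""

    def __init__(self):
        super().__init__()
        self.mailbox = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "mailbox":
            self.mailbox = attributes.get("href")

finder = MailboxFinder()
finder.feed('<html><head><link rel="mailbox" href="mymailbox"/></head></html>')
```

The href is relative, so the client would resolve it against the page's own URI before POSTing a notification to it.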
It should also be trivial to set up a new mailbox. For an end-user it should be a simple menu item "new mailbox", "What URI would you like it to have?" This would do a POST to a mailbox collection resource. Mailboxes could also be destroyed by DELETEing them. Mailboxes might support full WebDAV collection interfaces so that they could be managed and navigated just using WebDAV clients like Microsoft's "Web Folders" and the Office tools.
Recipient mailboxes should probably be distinct from "logical cabinets". Just as in the real world, cabinets are for organizing information that came in through mailboxes. Cabinets would be views over one or more mailboxes for navigation purposes. Each cabinet could either be a list of hypertext links into mailboxes or a "stored query". The query language would allow questions about all standard headers and full-text searching. Cabinets could live either on client machines, on the server or on totally unrelated Internet boxes. It would be common for the same message to live in multiple cabinets.
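A "stored query" cabinet is easy to picture in code: it is a predicate over message headers, evaluated across one or more mailboxes, yielding links rather than copies. Everything in this sketch (function names, the header dictionaries, the hostnames) is hypothetical illustration:

```python
def stored_query(predicate):
    """A cabinet as a stored query: a view over one or more mailboxes
    that holds hypertext links (URIs), not copies of the messages."""
    def cabinet(*mailboxes):
        return [uri
                for box in mailboxes
                for uri, headers in box.items()
                if predicate(headers)]
    return cabinet

# Hypothetical mailboxes mapping message URI -> header metadata:
work = {"http://box1/1": {"from": "boss@example.com", "subject": "budget"},
        "http://box1/2": {"from": "list@example.com", "subject": "fun"}}
home = {"http://box2/1": {"from": "boss@example.com", "subject": "hi"}}

from_boss = stored_query(lambda h: h["from"] == "boss@example.com")
links = from_boss(work, home)
```

Because the cabinet only stores URIs, the same message can appear in many cabinets, and the cabinet itself can live on the client, the server, or an unrelated machine.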
Mailboxes could have business rules associated with them. If we mapped messages and mailboxes into an RDF model internally, we could write those business rules in a standardized language like DAML+OIL. It would allow us to associate classes and subclasses with messages, check relationships (especially threading relationships) of messages and make inferences about messages. REST makes the relationship checking feasible by giving everything a URI.
A message forwarder or router would be essentially just a mailbox with business rules that say to forward a message when it meets certain criteria (including the null criteria). A queuing message forwarder falls out of the already implemented features naturally.
Mailboxes can be associated with groups rather than individuals, merely by giving the mailbox password or unguessable URI to group members. Therefore in addition to having mailing lists that push messages by forwarding them (as we do today) we can have mailing lists that operate entirely in a pull mode, waiting for list members to ask for messages. This is essentially like a single-server newsgroup. In fact, individual subscribers could choose whether to have messages forwarded to them or notifications forwarded to them, or neither. So one person would view it as a standard push-based mailing list and the other (low on bandwidth) would see it as a pull-based newsgroup.
If we wanted, we could go through the exercise of reinventing the rest of the Network News Transfer Protocol (NNTP) on top of HTTP. And by the way it should be trivially obvious how to "emulate" FTP using the WebDAV extension of HTTP. Can you see where this is heading? All of the *TP protocols can be viewed as specializations of HTTP.
Most desktop email uses a pull model, where the client polls every few minutes for mail. This works fine with REST. But what would it look like if the mailbox pushed notifications to the client?
Basically the client would have a tiny HTTP server embedded in it. It would register with the server (after appropriate authentication) that it would like notifications whenever the mailbox changes to be sent to a particular URI. Numerous clients could be registered at the same time. The server would push notifications at that URI. See the HTTPEvents spec for a more detailed design.
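The "tiny HTTP server embedded in the client" is only a few lines with Python's standard library. This is a sketch of the receiving end only; the registration step and the notification format are assumptions, and a real client would authenticate the pushes:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []

class NotificationHandler(BaseHTTPRequestHandler):
    """Tiny HTTP server embedded in the mail client: the mailbox POSTs
    change notifications to a URI the client registered earlier."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        received.append(self.rfile.read(length).decode())
        self.send_response(202)   # accepted for processing
        self.end_headers()

    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), NotificationHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The mailbox side, pushing a notification to the registered URI:
url = f"http://127.0.0.1:{server.server_port}/notifications"
urllib.request.urlopen(url, data=b"<notification>new mail</notification>")
server.shutdown()
```

Several clients could register distinct URIs with the same mailbox, and the server would push the same notification to each of them.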
Once the client knows that there is new mail, it needs to decide what mail to download. There are a few ways that a client could phrase the question: "is there any new mail for me?" A non-REST-y way is to set up some kind of persistent TCP or cookie session with the server and have the server keep track of which clients have got which messages. This is prone to connection loss and makes it difficult for an end-user to move a logical "session" between devices. The only way it could be made REST-y is if there were first-class URIs representing each client and those URIs had information about what messages the client had downloaded or not. Then any authorized client device could "act as" a particular client by manipulating that URI.
But a simpler solution is to let the client entirely keep track of what messages it has and does not have. It could use plain old client side files to remember when it last got messages. Moving the session between devices can be accomplished by moving the file. It could use a standard URL query to ask the mailbox for all messages deposited since a particular moment in time (using times provided by the server, not the client). If for any reason it is informed of the same message twice it is of course easy to filter duplicate URIs.
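That client-side bookkeeping can be sketched concretely: a plain file holds the server-provided timestamp of the last fetch plus the URIs already seen, and moving the file moves the session. The function names, the file format, and the stand-in `mailbox_query` are all my own illustration:

```python
import json
import os
import tempfile

def sync(mailbox_query, state_path):
    """Fetch new message URIs, remembering session state in a plain file.

    `mailbox_query(since)` stands in for a URL query against the mailbox;
    it returns (message URIs deposited since `since`, the server's clock).
    Using the server's clock avoids client/server time skew, and filtering
    on already-seen URIs makes duplicate notifications harmless.
    """
    state = {"since": None, "seen": []}
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
    uris, server_now = mailbox_query(state["since"])
    fresh = [u for u in uris if u not in state["seen"]]
    state["since"] = server_now
    state["seen"] += fresh
    with open(state_path, "w") as f:
        json.dump(state, f)
    return fresh

# A hypothetical mailbox reporting two messages and the server's clock:
def mailbox_query(since):
    return ["http://box/1", "http://box/2"], "2002-01-01T00:00:00Z"

state_path = os.path.join(tempfile.mkdtemp(), "session.json")
first = sync(mailbox_query, state_path)
again = sync(mailbox_query, state_path)   # duplicates filtered out
```

Copying `session.json` to another device is the whole story of moving the logical session; no server-side client registry is needed.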
The traditional Internet uses basically three different protocols for managing mail: SMTP, POP and IMAP. A REST equivalent could fold these all into HTTP. After all, IMAP and POP do not do much more than GETs and DELETEs. Typically the client chooses between them based on the usage metaphor the end-user is interested in (network-based or local mailboxes). This is precisely what REST avoids. The REST philosophy is "just give everything URIs and let the client software determine the usage metaphor." If POP had been done in a REST manner there would never have been a need to invent IMAP.
Because messages, mailboxes and cabinets are fundamentally Web resources, it should be possible to view them in Web browsers. Using content negotiation, the resources could be returned as XML, HTML or plain text, depending on the client software. Some server implementations would extend the basic HTML support with a full mail user agent interface including spell checking etc. And of course most people would likely choose to access the information through standard client-machine applications most of the time.
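The content-negotiation step is a straightforward match of the client's Accept header against the representations the server holds. A deliberately simplified sketch (a real server would also honor q-values and wildcards; the function name and representations are mine):

```python
def negotiate(accept_header, representations):
    """Pick a representation of a message by the client's Accept header.

    Walks the listed media types in order and returns the first one the
    server can produce, falling back to plain text. Quality parameters
    (";q=...") are stripped rather than honored, for brevity.
    """
    for media_type in accept_header.split(","):
        media_type = media_type.split(";")[0].strip()
        if media_type in representations:
            return media_type, representations[media_type]
    return "text/plain", representations["text/plain"]

message = {
    "text/html": "<p>Hello World</p>",
    "text/xml": "<message>Hello World</message>",
    "text/plain": "Hello World",
}

chosen = negotiate("text/xml,text/html;q=0.9", message)
```

A browser asking for HTML, a user agent asking for XML, and a dumb terminal asking for plain text would all GET the same message URI and receive different representations.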
Notice how a high-end web-based mail user interface running on top of HTTP, with a group-editable mailbox is essentially a web forum. Web forums, mailing lists and newsgroups would collapse into different interfaces for more or less the same thing.
Notice also how there is no need for separate "Web archives" of mail. The mailboxes would be the web archives. User agents would be able to work with them as easily as if they were local mailboxes.
In around 1996 I had a boss who had a habit of saying things that were years ahead of the market. We couldn't figure out how to make money from his insights. For instance he predicted that something like SGML would be the standard information and object representation of the future. The only problem is that his SGML company ran out of cash before XML came along to prove him right. He also used to say that we need to collapse all of the different ways of moving bits around the network into one protocol. FTP, HTTP, SMTP, why do we need all of these things? I didn't understand it then but I do now. His vision is still probably years away but it is nevertheless the future.
In around 1999 I went to a talk by Jim Whitehead and he made the observation that the HTTP extension WebDAV was essentially a replacement for FTP, could for many people be a replacement for NFS, and, if you looked at it closely, could even be the basis for email.
Sometime in 2000 I learned that Outlook uses HTTP to talk to HoTMaiL. So Microsoft uses HTTP for mail, and perhaps (if we looked closely) we might find that it is even good REST! Microsoft Exchange also uses WebDAV.
Jeff Bone has thought quite a bit about this issue and has contributed some important ideas to this paper.
Obviously the time has come for this idea to get a more methodical airing.
HTML rendition created using stylesheets by Wendell Piez of Mulberry Technologies.