What REST is really about

According to the primary source document.

I had thought that "RESTful" meant "easily accessible with an HTTP GET, even when something isn't HTML". Shortly after a RESTafarian pointed out that there was more to it than that, I went to Brian Sletten's excellent presentation REST: Information Architecture for the 21st Century at the Semantic Technologies conference and learned a lot more about what being RESTful implies. During the presentation I asked Brian whether Roy Fielding's 2000 doctoral thesis, the document that originally laid out what REST was all about, was readable, as PhD theses go, and he assured me that it was.

He was right. Anyone with a basic understanding of software architecture issues can and should read Fielding's thesis. I wish I'd read it years ago. I've copied a few nice quotes here, starting with this:

Software architecture research investigates methods for determining how best to partition a system, how components identify and communicate with each other, how information is communicated, how elements of a system can evolve independently, and how all of the above can be described using formal and informal notations.

What the acronym "Representational State Transfer" really means (emphasis mine):

REST components perform actions on a resource by using a representation to capture the current or intended state of that resource and transferring that representation between components. A representation is a sequence of bytes, plus representation metadata to describe those bytes. Other commonly used but less precise names for a representation include: document, file, and HTTP message entity, instance, or variant.
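
To make that definition concrete, here's a rough Python sketch (nothing from the thesis; the example.org URL is just a stand-in resource) that fetches one representation and pulls apart the two pieces Fielding names: the byte sequence and the metadata describing it.

```python
# A representation is "a sequence of bytes, plus representation metadata
# to describe those bytes." This fetches one representation of a resource
# and separates those two parts; example.org is only a stand-in resource.
from urllib.request import urlopen

with urlopen("https://example.org/") as response:
    body = response.read()   # the sequence of bytes
    metadata = {             # the metadata describing those bytes
        "Content-Type": response.headers.get("Content-Type"),
        "Content-Length": response.headers.get("Content-Length"),
        "Last-Modified": response.headers.get("Last-Modified"),
    }

print(metadata)
print(body[:80])
```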

A resource can offer a choice of representations for a component to select from; I now better appreciate the important role of content negotiation in REST:

This abstract definition of a resource... provides generality by encompassing many sources of information without artificially distinguishing them by type or implementation [and] allows late binding of the reference to a representation, enabling content negotiation to take place based on characteristics of the request. Finally, it allows an author to reference the concept rather than some singular representation of that concept, thus removing the need to change all existing links whenever the representation changes (assuming the author used the right identifier).
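
Here's what that late binding can look like on the wire, as a rough Python sketch: the same resource identifier is dereferenced twice, differing only in the Accept header, and a server that negotiates may return a different representation each time. The URI is one I made up for illustration.

```python
# One identifier, two requests that differ only in the Accept header.
# A server that negotiates can bind the same reference to different
# representations; the URI below is made up for illustration.
from urllib.error import HTTPError
from urllib.request import Request, urlopen

uri = "https://example.org/reports/2010-q2"

for media_type in ("text/html", "application/json"):
    request = Request(uri, headers={"Accept": media_type})
    try:
        with urlopen(request) as response:
            # A negotiating server may answer each Accept value with a
            # different Content-Type for the same resource.
            print(media_type, "->", response.headers.get("Content-Type"))
    except HTTPError as err:
        # 404 just means the made-up URI doesn't exist; 406 would mean
        # the server has no representation acceptable to this client.
        print(media_type, "->", err.code)
```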

With all the talk now of REST interfaces to services that are not necessarily delivering hypertext documents, it's interesting how often the thesis talks about REST being designed around hypermedia. The thesis's introduction refers to it as "REST, a novel architectural style for distributed hypermedia systems," and also mentions this,

REST is defined by four interface constraints: identification of resources; manipulation of resources through representations; self-descriptive messages; and, hypermedia as the engine of application state.

and this:

REST was originally referred to as the "HTTP object model," but that name would often lead to misinterpretation of it as the implementation model of an HTTP server. The name "Representational State Transfer" is intended to evoke an image of how a well-designed Web application behaves: a network of web pages (a virtual state-machine), where the user progresses through the application by selecting links (state transitions), resulting in the next page (representing the next state of the application) being transferred to the user and rendered for their use.
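
Here's a rough sketch of what that virtual state machine looks like from the client's side, in Python: the client starts at one URI and learns its possible next states only from the links carried in the representation it just received. The JSON shape with a "links" list is my own assumption for the example, not anything the thesis prescribes.

```python
# A toy hypermedia client: each representation it receives carries the
# links (possible state transitions) it is allowed to follow next.
# The JSON structure with a "links" list is an assumed convention.
import json
from urllib.request import urlopen

def get_state(uri):
    """Transfer the representation that captures the current state."""
    with urlopen(uri) as response:
        return json.load(response)

def follow(state, relation):
    """Select a link (a state transition) from the current representation."""
    for link in state.get("links", []):
        if link.get("rel") == relation:
            return get_state(link["href"])
    raise LookupError(f"no '{relation}' transition available in this state")

# Hypothetical usage: start at an entry point and then progress through
# the application purely by selecting links served in each representation.
# entry  = get_state("https://example.org/api")
# orders = follow(entry, "orders")
# first  = follow(orders, "first")
```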

"Resource" is a pretty commonly used term, with its position as the "R" in "RDF" being only the tip of the iceberg. So what exactly is a resource?

The resource is not the storage object. The resource is not a mechanism that the server uses to handle the storage object. The resource is a conceptual mapping—the server receives the identifier (which identifies the mapping) and applies it to its current mapping implementation (usually a combination of collection-specific deep tree traversal and/or hash tables) to find the currently responsible handler implementation and the handler implementation then selects the appropriate action+response based on the request content. All of these implementation-specific issues are hidden behind the Web interface; their nature cannot be assumed by a client that only has access through the Web interface.
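
The server side of that paragraph might look something like this rough Python sketch: the identifier names the mapping, the server applies it to its current mapping implementation (here just a dictionary) to find the responsible handler, and the handler selects the response. Every name here is invented for illustration.

```python
# The resource is the conceptual mapping, not the storage object.
# Here the "current mapping implementation" is just a dictionary from
# identifier to handler; the handler then selects the response based on
# the request. Every name is invented for illustration.

def todays_report_handler(method):
    # The same identifier can map to different underlying data over time
    # (today's report changes every day) without the client ever seeing
    # anything but the identifier and a representation.
    if method == "GET":
        return 200, {"Content-Type": "text/plain"}, b"sales were up"
    return 405, {"Allow": "GET"}, b""

mapping_implementation = {
    "/reports/today": todays_report_handler,
}

def dispatch(method, identifier):
    """Apply the identifier to the current mapping to find the handler."""
    handler = mapping_implementation.get(identifier)
    if handler is None:
        return 404, {}, b""
    return handler(method)

print(dispatch("GET", "/reports/today"))
print(dispatch("DELETE", "/reports/today"))
```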

This note on MIME's relationship to HTTP was interesting:

HTTP inherited its message syntax from MIME in order to retain commonality with other Internet protocols and reuse many of the standardized fields for describing media types in messages. Unfortunately, MIME and HTTP have very different goals, and the syntax is only designed for MIME's goals.

Why shouldn't you treat HTTP as a way to do Remote Procedure Calls? (And what's my new favorite adjective to put in front of "scalable"?)

What makes HTTP significantly different from RPC is that the requests are directed to resources using a generic interface with standard semantics that can be interpreted by intermediaries almost as well as by the machines that originate services. The result is an application that allows for layers of transformation and indirection that are independent of the information origin, which is very useful for an Internet-scale, multi-organization, anarchically scalable information system. RPC mechanisms, in contrast, are defined in terms of language APIs, not network-based applications.
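
To see why that matters to intermediaries, here's a rough Python sketch of a generic cache that knows nothing about any particular service: because "GET plus an identifier" has standard semantics, it can act on the request, while an RPC-style call whose meaning lives in a language API gives it nothing to work with. All the names here are made up.

```python
# A minimal intermediary that understands only the uniform interface.
# It can cache any GET because the method and identifier carry standard
# semantics; an RPC-style request whose meaning is buried in an
# application-specific API can only be passed through untouched.
# All of the names here are made up for illustration.
cache = {}
origin_hits = {"count": 0}

def origin(method, identifier, body=None):
    """Stand-in for the information origin behind the intermediary."""
    origin_hits["count"] += 1
    return f"{method} {identifier} (origin hit #{origin_hits['count']})"

def intermediary(method, identifier, body=None):
    if method == "GET":          # standard, safe, cacheable semantics
        if identifier not in cache:
            cache[identifier] = origin(method, identifier)
        return cache[identifier]
    # e.g. POST /endpoint with {"op": "getPriceV2"}: the intermediary
    # can't interpret it, so it just forwards the call.
    return origin(method, identifier, body)

print(intermediary("GET", "/prices/widget"))   # hits the origin
print(intermediary("GET", "/prices/widget"))   # served from the cache
```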

More on his carefully chosen terms "representation" and "transfer":

HTTP is not designed to be a transport protocol. It is a transfer protocol in which the messages reflect the semantics of the Web architecture by performing actions on resources through the transfer and manipulation of representations of those resources. It is possible to achieve a wide range of functionality using this very simple interface, but following the interface is required in order for HTTP semantics to remain visible to intermediaries.
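
Put another way, acting on a resource means transferring representations of it, something like this rough Python sketch: GET the current representation, change it locally, and PUT the modified representation back to express the resource's intended new state. The URI and the JSON document shape are my own inventions, and because only standard methods are used, intermediaries can still see what's going on.

```python
# Performing an action on a resource through the transfer and
# manipulation of a representation: GET the current representation,
# modify it, and PUT the new one back as the intended state.
# The URI and document shape are invented for illustration.
import json
from urllib.request import Request, urlopen

uri = "https://example.org/catalog/items/42"

with urlopen(uri) as response:
    item = json.load(response)        # current state, as a representation

item["price"] = 9.99                  # the intended new state

update = Request(
    uri,
    data=json.dumps(item).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PUT",
)
with urlopen(update) as response:
    print(response.status)            # e.g. 200 or 204 on success
```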

Keep in mind that this was published ten years ago, about a century in Internet time. It's more relevant than ever, and I recommend that you put it high on your reading list.

3 Comments

Exactly, Bob. If I had a company, the first tasks for any new starts would be to:

1) Read the /full/ REST dissertation (and chapters 5 & 6 twice!).
2) Read the original design for the world wide web.
3) Read the early HTTP and HTML specs, and also as many Design Issues as possible.

Regardless of whether they're a junior developer or a time-served senior architect.

The only point I will add is to remember that each RFC and specification has its own definition of "resource", with slight differences throughout; as TimBL recently pointed out, it's not a universal term with a universal meaning across all specs.

Best,

Nathan


If you want to see how this line of thinking can be applied to software development in the smallest units of functionality, look at NetKernel (http://www.1060research.com/netkernel/). That software platform is based on a REST microkernel and allows you to build all of your software this way.

-- Randy


As a matter of fact, in Brian's talk he discussed NetKernel a lot. It looks pretty cool.