REST lesson learned: Consider a self link on all resources

Add a self-link on RESTful resource.

This suggestion is part of my REST lessons learned series of blog posts. Contrary to previous posts, this advice doesn't originate from a lesson learned the hard way, but is more of a gentle suggestion.

One of the many advantages of a well-designed REST API is that it's 'web-scale'. The reason it's 'web-scale' is because it can leverage HTTP caching.

HTTP caching is based on URLs. If two URLs are different, they represent different resources, and can't be cached as one. However, conceivably, a client could arrive at the same resource via a number of different URLs:

If a client follows redirect, it may not arrive at the URL it originally requested.
If a client builds URLs from templates, it may order query string parameters in various ways.

If your API serves representations of the same resource for too many URLs, it could hurt cacheability. It may also confuse clients.

Consider, as courtesy, adding a self-link to all resources. Adding a self link is as easy as this:

<product xmlns="http://fnaah.ploeh.dk/productcatalog/2013/05"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:link href="http://catalog.api.ploeh.dk/products/1337"
             rel="self" />
  <atom:link href="http://catalog.api.ploeh.dk"
             rel="http://catalog.api.ploeh.dk/docs/rels/home" />
  <atom:link href="http://catalog.api.ploeh.dk/categories/chocolate"
             rel="http://catalog.api.ploeh.dk/docs/rels/category" />
  <atom:link href="http://catalog.api.ploeh.dk/categories/gourmet"
             rel="http://catalog.api.ploeh.dk/docs/rels/category" />
  <name>Fine Criollo dark chocolate</name>
  <price>50 DKK</price>
  <weight>100 g</weight>
</product>

Notice the link with the rel value of "self"; from its href value you now know that the canonical URL of that product resource is http://catalog.api.ploeh.dk/products/1337.

Particularly when there are (technically) more than one URL that will serve the same content, do inform the client about the canonical URL. There's no way a client can tell which URL variation is the canonical URL. Only the API knows that.

Redirected clients #

A level 3 RESTful API can evolve over time. As I previously described, you may start with URLs like http://foo.ploeh.dk/orders/1234 only to realize that in a later version of the API, you'd rather want it to be http://foo.ploeh.dk/customers/1234/orders.

As the RESTful Web Services Cookbook explains, you should keep URLs cool. In practice, that means that if you change the URLs of your API, you should at least leave a 301 (Moved Permanently) at the old URL.

If you imagine a mature API, this may have happened more than once, which means that a client arriving at one of the original URLs may be redirected several times.

In your browser, you know that if redirects happen, the address bar will (normally) display the final URL at which you arrived. However, consider that REST clients are applications. It will be implementation-specific whether or not the client realizes that the value of the final address isn't the URL originally requested.

As a courtesy to clients, do consider informing them of the final address at which they arrived.

URL variations #

While the following scenario isn't applicable for level 3 RESTful APIs, level 1 and 2 services require clients to construct URLs from templates. As an example, such services may define a product search resource as /products/search?q={query}&p={page}. To search for chocolate (and see the third (zero-indexed) page) you could construct the URL as /products/search?q=chocolate&p=2. However, most platforms (e.g. the ASP.NET Web API) would handle /products/search?p=2&q=chocolate in exactly the same way. The more query parameters you allow, the more permutations are possible.

This is a well-known problem in search (and computer science); it's called Canonicalization. Only the API knows the canonical value of a given URL: do consider informing the client about this value.

Summary #

Adding a self-link to each resource isn't much of a burden for the service, but could provide advantages for both the service and its clients. In most cases I'd expect the ROI to be high, simply because the investment is low.

Comments

Ludovic Fleury #

Why not using the Content-Location header for this? It the perfect candidate. As for the mutliple redirections scenario, the client would have know after each 301 what Location it targets? NB: the PR to comment is fun, yet it's very annoying.

2014-02-13 22:45 UTC

Mark Seemann #

The primary reason I didn't mention the Content-Location header here was that, frankly, I wasn't aware of it. Now that I've read the specification, I'll have to take your word for it :) You may be right, in which case there's no particular reason to prefer a self-link over a Content-Location header.

When it comes following links, it's true that a protocol-near client would know the location after each redirect. The question is, if you're using an HTTP library, do you ever become aware of this?

Consider a client, using a hypothetical HTTP client library:

client.FollowRedirects = true;
var response = client.Get("/foo");

The response may actually be the result of various redirects, so that the final response is from "/bar", but unless the client is very careful, it may never notice this.

2014-02-14 16:43 UTC

Published: Friday, 03 May 2013 10:32:00 UTC

REST lesson learned: Consider a self link on all resources by Mark Seemann

Redirected clients #

URL variations #

Summary #

Comments

Wish to comment?

Published

Tags