Keep REST API URLs evolvable. A way to address Hyrum's law.

Publishing and maintaining a true (level 3) RESTful API is difficult. This is the style of REST design where clients are expected to follow links to perform the work they want to accomplish; not assemble URLs from templates.

Have you ever designed a URL scheme and published it, only to later discover that you wished you'd come up with a different structure? If you've published a set of URL templates, changing your mind constitutes a breaking change. If clients follow links, however, URLs are opaque and you can redesign the URLs without breaking existing clients.

You can try to document this design principle all you want, to no avail. You can tell client developers that they're supposed to follow links, not try to retro-engineer the URLs, and still they'll do it.

I know; I've experienced it. When we later changed the URL structure, it didn't take long for the client developers to complain that we broke their code.

Hyrum's law #

This is an example of Hyrum's law in action, albeit on the scale of web service interactions, rather than low-level APIs. The presence of a discernible system to URLs suggests an implicit interface.

Consider this 'home' resource for an online restaurant reservation system:

GET / HTTP/1.1

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
{
  "links": [
    {
      "rel""urn:reservations",
      "href""http://localhost:53568/reservations"
    },
    {
      "rel""urn:year",
      "href""http://localhost:53568/calendar/2020"
    },
    {
      "rel""urn:month",
      "href""http://localhost:53568/calendar/2020/10"
    },
    {
      "rel""urn:day",
      "href""http://localhost:53568/calendar/2020/10/23"
    }
  ]
}

It doesn't take much interaction with the API before you realise that there's a system to the URLs provided in the links. If you want to see the calendar for a specific date, you can easily retro-engineer the URL template /calendar/{yyyy}/{MM}/{dd}, and /calendar/{yyyy}/{MM} for a month, and so on.

The same is likely to happen with the reservations link. You can POST to this link to make a new reservation:

POST /reservations HTTP/1.1
Content-Type: application/json
{
  "at""2020-12-09 19:15",
  "email""rainboughs@example.com",
  "name""Raine Burroughs",
  "quantity": 5
}

HTTP/1.1 201 Created
Content-Type: application/json
Location: http://localhost:53568/reservations/fabc5bf63a1a4db38b95deaa89c01178
{
  "id""fabc5bf63a1a4db38b95deaa89c01178",
  "at""2020-12-09T19:15:00.0000000",
  "email""rainboughs@example.com",
  "name""Raine Burroughs",
  "quantity": 5
}

Notice that when the API responds, its Location header gives you the URL for that particular reservation. It doesn't take long to figure out that there's a template there, as well: /reservations/{id}.

So client developers may just store the ID (fabc5bf63a1a4db38b95deaa89c01178) and use the implied template to construct URLs on the fly. And who can blame them?

That, however, misses the point of REST. The ID of that reservation isn't fabc5bf63a1a4db38b95deaa89c01178, but rather http://localhost:53568/reservations/fabc5bf63a1a4db38b95deaa89c01178. Yes, the URL, all of it, is the ID.

Evolving URLs #

Why does that matter?

It matters because you're human, and you make mistakes. Or, rather, it's intrinsic to software development that you learn as you work. You'll make decisions at the beginning that you'll want to change as you gain more insight.

Also, requirements change. Consider the URL template scheme implied by the above examples. Can you spot any problems? Would you want to change anything?

Imagine, for example, that you've already deployed the first version of the API. It's a big success. Now the product owner wants to expand the market to more restaurants. She wants to make the service a multi-tenant API. How does that affect URLs?

In that new context, perhaps URLs like /restaurants/1/reservations or /restaurants/90125/calendar/2020/10 would be better.

That, however, would be a breaking change if clients construct URLs based on implied templates.

Couldn't you just pass the restaurant ID as an HTTP header instead of in the URL? Yes, technically you could do that, but that doesn't work well with HTTP caching. It's not a RESTful thing to do, for that, and other, reasons.

Fitness #

Do we just give up in the face of Hyrum's law? Or can we keep URLs evolvable? In evolution, organisms evolve according to a 'fitness function', so to name such URLs, we could call them fit URL.

To keep URLs fit, we must prevent client developers from retro-engineering the implied interface. My original thought was to give each URL an opaque ID, such as a GUID, but in 2015 Dan Kubb instead suggested to sign the URLs. What a great idea!

If you do that, then the above home resource might look like this:

{
  "links": [
    {
      "rel""urn:reservations",
      "href""http://localhost:53568/restaurants/1/reservations?sig=1WiLlS5705bfsffPzaFYLwntrS4FCjE5CLdaeYTHxxg%3D"
    },
    {
      "rel""urn:year",
      "href""http://localhost:53568/restaurants/1/calendar/2020?sig=eIFuUkb6WprPrp%2B4HPSPaavcUdwVjeG%2BKVrIRqDs9OI%3D"
    },
    {
      "rel""urn:month",
      "href""http://localhost:53568/restaurants/1/calendar/2020/10?sig=mGqkAjY7vMbC5Fr7UiRXWWnjn3pFn21MYrMagpdWaU0%3D"
    },
    {
      "rel""urn:day",
      "href""http://localhost:53568/restaurants/1/calendar/2020/10/23?sig=Ua5F%2FucP6zmAy219cHa4WG7zIcCa0hgVD5ModXcNQuo%3D"
    }
  ]
}

Even if you can still figure out what the URL templates are, it doesn't avail you. Creating a new reservation may return a URL like https://localhost:53568/restaurants/1/reservations/fabc5bf63a1a4db38b95deaa89c01178?sig=8e80PmVi8aSS1UH6iSJ73nHmOsCLrUMs7yggEOkvEqo%3D, but you can't just replace the ID with another ID and expect it to work:

GET /restaurants/1/reservations/79520877ef4f4acdb69838e22ad04510?sig=8e80PmVi8aSS1UH6iSJ73nHmOsCLrUMs7yggEOkvEqo%3D HTTP/1.1

HTTP/1.1 404 Not Found

You're requesting a URL that doesn't exist, so the result is 404 Not Found. To be clear: yes, there is a reservation with the ID 79520877ef4f4acdb69838e22ad04510, but its URL isn't the above URL.

ASP.NET implementation #

In two articles, I'll show you how to implement fit URLs in ASP.NET Core.

The ASP.NET framework comes with enough extensibility points to make this a non-intrusive operation. I implemented this using a filter and a few Decorators in a way so that you can easily turn the feature on or off.

Conclusion #

One of the major benefits of true RESTful API design is that it's evolvable. It enables you to learn and improve as you go, without breaking existing clients.

You have to take care, however, that clients don't retro-engineer the URL templates that you may be using for implementation purposes. You want to be able to change URLs in the future.

Hyrum's law suggests that clients will rely on undocumented features if they can. By signing the URLs you keep them fit to evolve.

Next: Signing URLs with ASP.NET.


Comments

I see how signing the URLs prevents clients from retro-engineering the URL templates, but how does it help preventing breaking changes? If the client stores the whole URL instead of just the ID and later the URL changes because a restaurant ID is added, the original URL is still broken, isn't it?

2020-10-27 07:20 UTC

Daniel, thank you for writing. You're correct when assuming that clients are allowed to 'bookmark' URLs. An implicit assumption that I didn't explicitly state is that clients are supposed to follow not only links, but also redirects. Thus, to avoid breaking changes, it's the API's responsibility to leave a 301 Moved Permanently response behind at the old address.

As a service owner, though, you have some flexibility in how to achieve this. You can code this into the service code itself, but another option might be to use a reverse proxy for such purposes. One of the many advantages of REST is that you can offload a lot HTTP-level behaviour on standard networking software; here's another example.

2020-10-28 7:15 UTC

I think I have the same confusion as Daniel.

If you've published a set of URL templates, changing your mind constitutes a breaking change. If clients follow links, however, URLs are opaque and you can redesign the URLs without breaking existing clients.

...

You're correct when assuming that clients are allowed to 'bookmark' URLs. An implicit assumption that I didn't explicitly state is that clients are supposed to follow not only links, but also redirects. Thus, to avoid breaking changes, it's the API's responsibility to leave a 301 Moved Permanently response behind at the old address.

Is there a certain kind of breaking change that exists for a level 2 API that doesn't exist for a level 3 API?

2020-10-28 22:55 UTC

Tyson, thank you for writing. I haven't thought much about whether there are categories of errors that differ between the levels, but I think that in practice, there's a difference in mindset.

With a level 2 API, I think most people are still caught in a mindset that's largely a projection of RPC. URL templates map easily to programming procedures (e.g. object-oriented methods). Thus, you have a mental model that includes PostReservation, GetCalendar, etc. People think of that as being the API. With that mindset, I don't think that many client developers configure their HTTP clients to follow redirects. Thus, one could argue that changing URLs are breaking changes for a level 2 API, even if you leave 301 Moved Permanently responses behind.

With level 3 APIs, you encourage client developers to think in terms of 'the web'. That includes following links and redirects. I believe that there's a difference in perception, even if there may not be any technical difference.

I do believe, however, that the real advantage is that you impose a smaller maintenance burden on client developers with a level 3 API. Granted, a client developer may have to spend a little more effort up front to follow links, but once a compliant client is in place, it needs no more maintenance. It'll just keep working.

Not so with published URL templates. Here you have to entice client developers to actively update their code. This may be impossible if you don't know who the clients are, or if the client software is in maintenance mode. This may make it harder to retire old versions. You may be stuck with these old versions forever.

2020-10-30 17:32 UTC


Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Monday, 26 October 2020 06:19:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!
Published: Monday, 26 October 2020 06:19:00 UTC