Automatically enable act-on-behalf-of capabilities in REST APIs.

In 2013 I published a series of API design tips called REST lessons learned. Eight years have passed, but why not add another entry?

This one I've known about for years, but never written down. I often use it when I consult teams, and each time I'm reminded that since this seems like a recurring piece of advice, I ought to write it down.

Nutshell #

The problem, in a nutshell, relates to secured resources in a REST API. This could be any resource where the client must be authenticated before being able to access it. This design tip, however, seems to be mostly applicable when the resource in question itself represents an 'identity'.

To scope the problem, API designers rarely falter when modelling resources that seem unrelated to security or identity. For example, if you're modelling a product catalogue and you want to enable some clients to edit the catalogue, it's clear to most people that a product is unrelated to the identity of the client. Thus, people naturally design URL schemes like products/1234, and that's fine. You can make a PUT request against products/1234 to edit the resource, but you must supply credentials in order to do so.

What if, however, you want to edit your own profile information? There might be a REST resource that exposes your user name, address, bio, avatar, etc. You want to make profile information editable. How do you design the API?

API designers often design such an API based on a URL like profile, without any identifier in the URL. After all, a client must be authenticated in order to edit the resource, so the user ID will somehow be in the HTTP header (e.g. as a JSON Web Token (JWT)).

Consider, nonetheless, including the identity in the URL.

A profile resource, then, would follow a scheme like profiles/1234. Consider identifying tenant IDs in a multi-tenant system in the same way: tenants/2345. Do this even when other IDs follow: tenants/2345/products/9876.

Typical approach, not recommended #

As outlined above, a typical design is to design an 'identity' resource without including the identification in the URL. If, for example, a client wants to change the avatar via a REST API, it might have to do it like this:

PUT /users HTTP/1.1
Content-Type: application/json
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5c[...]

{
  "bio": "Danish software design",
  "avatar": "ploeh.png"
}

The server-side code can extract the user ID and other authentication information from the Bearer token in the HTTP header. It can use this information to find the user ID and update its database. Technically, this gets the job done.

I'll outline some potential problems with such a design in a moment, but first I'll show a second example. This one is more subtle.

Imagine an online restaurant reservation system. The system enables guests to make reservations, edit them, and so on. When a potential guest attempts to make a reservation, the API should check if it can accept it. See The Maître d' kata for various conditions that may cause the restaurant to reject the reservation. One case might be that the reservation attempt is outside of the restaurant's opening hours.

Perhaps the API should expose a management API that enables the restaurant's maître d'hôtel to change the opening hours. Perhaps you decide to design the API to look like this:

PUT /restaurant HTTP/1.1
Content-Type: application/json
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5c[...]

{
  "opensAt": "18:00",
  "lastSeating": "21:00",
  "seatingDuration": "6:00"
}

Again, the Bearer token is supposed to contain enough information about the user to enable authentication and authorisation. This also gets the job done, but might paint you into a corner.

Separation of concerns #

The problem with the above approach is that it fails to separate concerns. When modelling identity, it's easy to conflate the identity of the resource with the identity of the client interacting with it. Those are two separate concerns.

What happens, for example, if you have so much success with the above restaurant reservation system that you decide to offer it as a multi-tenant service?

I often see a 'solution' to such a requirement where API designers now require clients to supply a 'tenant ID' in the HTTP header. To make it secure, you should probably make it a claim in the JWT supplied via the Authorization header, or something to that effect.

What's wrong with that? It conflates the identity of the client with the identity of the resource. This means that you can't easily enable capabilities where a client can act on behalf of someone else.

Imagine, for example, that you have three restaurants, each a tenant: Hipgnosta, Nono, and The Vatican Cellar. It turns out, however, that Hipgnosta and Nono have the same owners, and share a single administrative employee. These restaurants wish to let that employee manage both restaurants.

With the design outlined above, the employee would have to authenticate twice in order to make changes to both restaurants. That may not be a big deal for occasional edits to two restaurants, but imagine an employee who has to manage hundreds of franchises, and the situation becomes untenable.

You should enable act-on-behalf-of capabilities. This may sound like speculative generality, but it's such low-hanging fruit that I think you should enable it even if you don't need it right now. Just put the resource identity in the URL: restaurants/456 and users/1234.

Even for user profiles, putting the user ID in the URL enables one client to view (if not edit) other user profiles, which may or may not be desirable.

The API should still demand that clients authenticate, but now you can distinguish the resource from the client making the request. This makes it possible for a client to act on behalf of others, given the right credentials.
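As a minimal sketch of that distinction, consider an authorisation check for user profiles. The names here (ProfileAccess, CanEdit, and the set of user IDs a client may act on behalf of) are hypothetical and only serve to illustrate the idea: the resource ID arrives via the URL, the client ID via the credentials, and the decision compares the two.

```csharp
using System.Collections.Generic;

// Hypothetical sketch; none of these names come from an actual code base.
public static class ProfileAccess
{
    // resourceUserId: the ID from the URL, e.g. the 1234 in users/1234.
    // clientUserId: the ID of the authenticated client, e.g. from a JWT claim.
    // onBehalfOf: other user IDs that the client has been granted access to.
    public static bool CanEdit(
        int resourceUserId,
        int clientUserId,
        IReadOnlyCollection<int> onBehalfOf)
    {
        // Clients can always edit their own profile.
        if (resourceUserId == clientUserId)
            return true;

        // Otherwise, the client must hold an explicit grant for the resource.
        foreach (var id in onBehalfOf)
            if (id == resourceUserId)
                return true;
        return false;
    }
}
```

With this shape, a client with ID 1234 can edit users/1234 without any grants, while a client with ID 42 can only edit users/1234 if 1234 is among its grants. Had the user ID only been available from the credentials, there would be nothing to compare.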

Restaurant schedule example #

I'll show you a slightly different example. Instead of editing a restaurant's opening or closing hours, I'll show you how the maître d' can view the schedule for a day. A previous article already suggested that such a resource might exist in a code base I've recently written. A request and its response might look like this:

GET /restaurants/1/schedule/2022/8/21 HTTP/1.1
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJyZXN0YXVyYW5[...]

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8

{
  "name": "Hipgnosta",
  "year": 2022,
  "month": 8,
  "day": 21,
  "days": [
    {
      "date": "2022-08-21",
      "entries": [
        {
          "time": "19:45:00",
          "reservations": [
            {
              "id": "0cced578fa21489bb0e3b5eb6be6825a",
              "at": "2022-08-21T19:45:00.0000000",
              "email": "annekoicchamber@example.com",
              "name": "Anne Kowics Chambers",
              "quantity": 5
            }
          ]
        }
      ]
    }
  ]
}

I've simplified the example response by removing all links to make it more readable. After all, the shape of the response is irrelevant for this discussion. The point is the interaction between the request URL and the JWT.

The request is against a URL that identifies the restaurant in question. The 1 after restaurants in /restaurants/1/schedule/2022/8/21 identifies the restaurant as Hipgnosta to the API. (In reality, clients are expected to follow links. URLs are signed with HMACs, but I've trimmed those off as well to simplify the example.)
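To give an impression of what signing a URL with an HMAC might involve, here's a minimal sketch. It is not the implementation behind this API; the UrlSigner class, the sig query parameter, and the choice of HMAC-SHA256 are all assumptions made for the sake of the example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical sketch. The class name, the 'sig' query parameter, and the use
// of HMAC-SHA256 are assumptions, not details of the API discussed here.
public sealed class UrlSigner
{
    private readonly byte[] key;

    public UrlSigner(byte[] key)
    {
        this.key = key;
    }

    // Appends a signature computed over the entire unsigned URL.
    public string Sign(string url)
    {
        var separator = url.Contains("?") ? "&" : "?";
        return url + separator + "sig=" + Uri.EscapeDataString(ComputeSignature(url));
    }

    // Returns true only if the URL ends with a signature that matches the
    // part of the URL that precedes it.
    public bool IsValid(string signedUrl)
    {
        var index = Math.Max(
            signedUrl.LastIndexOf("?sig=", StringComparison.Ordinal),
            signedUrl.LastIndexOf("&sig=", StringComparison.Ordinal));
        if (index < 0)
            return false;

        var url = signedUrl.Substring(0, index);
        var actual = Uri.UnescapeDataString(signedUrl.Substring(index + 5));
        var expected = ComputeSignature(url);

        // Compare in constant time to avoid leaking information via timing.
        return CryptographicOperations.FixedTimeEquals(
            Encoding.UTF8.GetBytes(actual),
            Encoding.UTF8.GetBytes(expected));
    }

    private string ComputeSignature(string url)
    {
        using (var hmac = new HMACSHA256(key))
            return Convert.ToBase64String(hmac.ComputeHash(Encoding.UTF8.GetBytes(url)));
    }
}
```

A client that changes the restaurant ID in a signed URL invalidates the signature, so signing discourages clients from constructing URLs by hand. It doesn't replace the access control check, since the signature says nothing about who the client is.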

In this multi-tenant API, each restaurant is a separate tenant. Thus, the restaurant ID is really a tenant ID. The resource is fully identified via the URL.

What about the client identity? It's supplied via the JWT, which, when decoded, contains these claims:

{
  "restaurant": [
    "1",
    "2112"
  ],
  "role": "MaitreD",
  "nbf": 1618301674,
  "exp": 1618906474,
  "iat": 1618301674
}

Notice that the restaurant array contains a list of IDs that identify the tenants that the JWT can access. This particular JWT can access both restaurants 1 and 2112, which correspond to Hipgnosta and Nono. This represents the shared employee who can act on behalf of both restaurants.
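To illustrate how the tenant IDs could be read from such a set of claims, here's a small sketch that uses System.Text.Json. The ClaimsReader name is hypothetical, and the sketch only reads an already decoded payload. It doesn't verify the token's signature, which a real API must do before it trusts any claim.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper. It assumes that the JWT signature has already been
// verified and that the payload has been decoded to a JSON object.
public static class ClaimsReader
{
    // Returns the integer IDs in the "restaurant" claim, skipping any entry
    // that isn't a string containing an integer.
    public static IReadOnlyList<int> ReadRestaurantIds(string claimsJson)
    {
        var ids = new List<int>();
        using (var doc = JsonDocument.Parse(claimsJson))
        {
            if (!doc.RootElement.TryGetProperty("restaurant", out var claim))
                return ids;
            if (claim.ValueKind != JsonValueKind.Array)
                return ids;

            foreach (var element in claim.EnumerateArray())
                if (element.ValueKind == JsonValueKind.String
                    && int.TryParse(element.GetString(), out var id))
                    ids.Add(id);
        }
        return ids;
    }
}
```

Given the claims above, ReadRestaurantIds returns the two IDs 1 and 2112. A token without a restaurant claim yields an empty list, which means that the client can't act on behalf of any tenant.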

Access control #

The API checks that the incoming JWT has a restaurant claim that matches the incoming restaurant ID. Only if that's the case will it let the request through.

[HttpGet("restaurants/{restaurantId}/schedule/{year}/{month}/{day}")]
public async Task<ActionResult> Get(int restaurantId, int year, int month, int day)
{
    if (!AccessControlList.Authorize(restaurantId))
        return new ForbidResult();
 
    // Do the real work here...

The above code fragment is a copy from another article where I already shared some of the server-side authorisation code. Here I'll show some of the code that I didn't show in the other article.

In the other article, you can see how the AccessControlList is populated from HttpContext.User, but I didn't show the implementation of the FromUser function. Here it is:

internal static AccessControlList FromUser(ClaimsPrincipal user)
{
    var restaurantIds = user
        .FindAll("restaurant")
        .SelectMany(c => ClaimToRestaurantId(c))
        .ToList();
    return new AccessControlList(restaurantIds);
}
 
private static int[] ClaimToRestaurantId(Claim claim)
{
    if (int.TryParse(claim.Value, out var i))
        return new[] { i };
    return Array.Empty<int>();
}

What you need to notice is just that the FromUser function finds and parses all the "restaurant" claims it can find. The Authorize method, subsequently, just looks for the incoming restaurantId among them:

internal bool Authorize(int restaurantId)
{
    return restaurantIds.Contains(restaurantId);
}

Thus, the identity of the resource is decoupled from the identity of the client. In this example, the client acts on behalf of two tenants, but since an array can hold an arbitrary number of values, there's no hard limit to how many tenants a single client could act on behalf of.
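To see the pieces work together outside of a web request, here's a sketch that assembles them into a single class. FromUser, ClaimToRestaurantId, and Authorize are the members shown above. The constructor, the type of the restaurantIds field, and the AccessControlExample helper are assumptions added so that the example is self-contained. The sketch also assumes that each tenant ID surfaces as a separate claim of type "restaurant", which is what the use of FindAll suggests.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Claims;

public sealed class AccessControlList
{
    // Assumption: the field's type isn't shown in the article.
    private readonly IReadOnlyCollection<int> restaurantIds;

    // Assumption: the constructor isn't shown in the article either.
    internal AccessControlList(IReadOnlyCollection<int> restaurantIds)
    {
        this.restaurantIds = restaurantIds;
    }

    internal static AccessControlList FromUser(ClaimsPrincipal user)
    {
        var restaurantIds = user
            .FindAll("restaurant")
            .SelectMany(c => ClaimToRestaurantId(c))
            .ToList();
        return new AccessControlList(restaurantIds);
    }

    // Claims that don't contain an integer are ignored rather than rejected.
    private static int[] ClaimToRestaurantId(Claim claim)
    {
        if (int.TryParse(claim.Value, out var i))
            return new[] { i };
        return Array.Empty<int>();
    }

    internal bool Authorize(int restaurantId)
    {
        return restaurantIds.Contains(restaurantId);
    }
}

// Hypothetical helper, only here to exercise the class above.
public static class AccessControlExample
{
    // Builds a principal like the shared employee: one "restaurant" claim
    // per tenant, plus the role claim.
    public static ClaimsPrincipal SharedEmployee()
    {
        var claims = new[]
        {
            new Claim("restaurant", "1"),
            new Claim("restaurant", "2112"),
            new Claim("role", "MaitreD")
        };
        return new ClaimsPrincipal(new ClaimsIdentity(claims));
    }
}
```

An AccessControlList created from AccessControlExample.SharedEmployee() authorises restaurant IDs 1 and 2112, and rejects any other ID. A request against a third tenant's URL would thus be forbidden, even though the client is authenticated.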

Conclusion #

You don't always need act-on-behalf-of security features, but you never know if such a need might emerge in the future. You're going to need to check client credentials anyway, so the only extra step to avoid painting yourself into a corner is to put the resource identity in the URL - even if you believe that the resource identity and the client identity are the same. Such assumptions have a tendency to be proven wrong over time.

I'm not usually a proponent of speculative generality, but I also think that it's prudent to consider overall return on investment. The cost of adding the resource identity to the URL is low, while having to change URL schemes later may carry a higher cost (even if you force clients to follow links).

This fits one view on software architecture: Make it as easy as possible to make reactive changes to the system, but identify the areas where change will be hard; make good ex-ante decisions about those.

Finally, I think that there's something fundamentally correct and consistent in putting user or tenant IDs in the URLs. After all, you put all other resource IDs (such as product IDs or customer IDs) in URLs.

Notice, in the above schedule example, how the restaurant ID isn't the only ID. The URL also carries information about year, month, and day. These further identify the schedule resource.

Putting user or tenant IDs in the URL effectively separates concerns. It enables you to discern the tenant or user from the client making the request.



Published

Monday, 19 April 2021 06:29:00 UTC
