Discoverability in API design

There are a handful of questions a user will (implicitly) ask when using your API:

  1. What actions can I do against this endpoint?
  2. How do I find the URLs for those actions?
  3. What information do I need to provide in order to perform this action?
  4. What permission do I need in order to perform this action?

Answering these questions can be automated. The user, and the tools they use, can discover the answers by working with the system. That is what I mean when I use the word “Discoverability.”

We missed some opportunities to answer these questions when we designed the APIs for Keystone OpenStack. I’d like to talk about how to improve on what we did there.

First I’d like to state what not to do.

Don’t make the user read the documentation and then write code against an external spec.

Never require a user to manually perform an operation that should be automated. Answering every one of those questions can be automated. If you can get it wrong, you will get it wrong. Make it possible to catch errors as early as possible.

Let’s start with the question: “What actions can I do against this endpoint?” In the case of Keystone, the answer would be some of the following:

Create, Read, Update and Delete (CRUD) Users, Groups of Users, Projects, Roles, and Catalog Items such as Services and Endpoints. You can also CRUD relationships between these entities. You can CRUD Entities for Federated Identity. You can CRUD Policy files (historical). Taken in total, you have the tools to make access control decisions for a wide array of services, not just Keystone.

The primary way, however, that people interact with Keystone is to get a token. Let’s use this use case to start. To get a token, you make a POST to the $OS_AUTH_URL/v3/auth/tokens/ URL. The data

How would you know this? Only by reading the documentation. If someone handed you the value of their OS_AUTH_URL environment variable, and you looked at it using a web client, what would you get? Really, just the version URL. Assuming you chopped off the V3:

$ curl http://10.76.10.254:35357/
{"versions": {"values": [{"id": "v3.14", "status": "stable", "updated": "2020-04-07T00:00:00Z", "links": [{"rel": "self", "href": "http://10.76.10.254:35357/v3/"}], "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}]}]}}

and the only URL in there is the version URL, which gives you back the same thing.
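To make the dead end concrete, here is a minimal Python sketch that parses the version document shown above and collects every link it advertises. The helper name advertised_links is my own invention for illustration; it is not part of Keystone or any OpenStack client.

```python
import json

# The version document returned by the curl call above.
VERSION_DOC = json.loads("""
{"versions": {"values": [{"id": "v3.14", "status": "stable",
  "updated": "2020-04-07T00:00:00Z",
  "links": [{"rel": "self", "href": "http://10.76.10.254:35357/v3/"}],
  "media-types": [{"base": "application/json",
                   "type": "application/vnd.openstack.identity-v3+json"}]}]}}
""")


def advertised_links(doc):
    """Return (rel, href) pairs for every link in a version document."""
    pairs = []
    for version in doc["versions"]["values"]:
        for link in version.get("links", []):
            pairs.append((link["rel"], link["href"]))
    return pairs


links = advertised_links(VERSION_DOC)
print(links)
```

A client that only follows links has exactly one place to go from here, and it is the page it is already on.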

If you point a web browser at the service, the output is in JSON, even though the web browser told the server that it preferred HTML.

What could this look like? If we look at the API spec for Keystone, we can see that the various entities referred to above have fairly predictable URL forms. However, for this use case, we want a token, so we should, at a minimum, see the path to get to the token. Since this is the V3 API, we should see an entry like this:

{"rel": "auth", "href": "http://10.76.10.254:35357/v3/auth"}], "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}]}

And if we then performed an HTTP GET on http://10.76.10.254:35357/v3/auth we should see a link to the token resource:

{"rel": "token", "href": "http://10.76.10.254:35357/v3/auth/tokens"}], "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}]}
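If the links were there, a client could reach the token URL without hard-coding it. The following is a sketch under stated assumptions: the responses are hypothetical (Keystone does not return these links today), I use the real token path /v3/auth/tokens as the target, and the follow function and the fetch callback are names I made up for the example.

```python
# Hypothetical responses, keyed by URL. Keystone does not return these today;
# they model the "auth" and "token" links proposed in the text.
FAKE_RESPONSES = {
    "http://10.76.10.254:35357/v3/": {
        "links": [
            {"rel": "self", "href": "http://10.76.10.254:35357/v3/"},
            {"rel": "auth", "href": "http://10.76.10.254:35357/v3/auth"},
        ]
    },
    "http://10.76.10.254:35357/v3/auth": {
        "links": [
            {"rel": "token",
             "href": "http://10.76.10.254:35357/v3/auth/tokens"},
        ]
    },
}


def follow(start_url, rels, fetch):
    """Walk from start_url, following one link relation per step."""
    url = start_url
    for rel in rels:
        doc = fetch(url)
        matches = [link["href"] for link in doc.get("links", [])
                   if link["rel"] == rel]
        if not matches:
            raise LookupError(f"no link with rel={rel!r} at {url}")
        url = matches[0]
    return url


# fetch is injected so the walk can be tried without a live server.
token_url = follow("http://10.76.10.254:35357/v3/", ["auth", "token"],
                   FAKE_RESPONSES.__getitem__)
print(token_url)
```

Passing fetch in as a parameter keeps the walk independent of any HTTP library; in real use it would be a function that performs a GET and decodes the JSON body.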

Is this 100% of the solution? No. The Keystone API shows its prejudice toward password-based authentication, a very big antipattern. The password goes in clear text into the middle of the JSON blob posted to this API. We trust SSL/TLS to secure it over the wire, and we have had to scrub it from logs and debugging output. This is actually a step backwards from BASIC_AUTH in HTTP. All this aside, there is still no way to tell what you need to put into the body of the token request without reading the documentation... unless you know the magic of JSON-HOME.

Here is what you would need to do to get a list of the top-level URLs, excluding all the ones that are templated and thus require knowing an ID.

curl 10.76.116.63:5000 -H "Accept: application/json-home" | jq '.resources | to_entries | .[] | .value | .href ' | sort -u
  • “/v3/auth/catalog”
  • “/v3/auth/domains”
  • “/v3/auth/OS-FEDERATION/saml2”
  • “/v3/auth/OS-FEDERATION/saml2/ecp”
  • “/v3/auth/projects”
  • “/v3/auth/system”
  • “/v3/auth/tokens”
  • “/v3/auth/tokens/OS-PKI/revoked”
  • “/v3/credentials”
  • “/v3/domains”
  • “/v3/domains/config/default”
  • “/v3/ec2tokens”
  • “/v3/endpoints”
  • “/v3/groups”
  • “/v3/limits”
  • “/v3/limits/model”
  • “/v3/OS-EP-FILTER/endpoint_groups”
  • “/v3/OS-FEDERATION/identity_providers”
  • “/v3/OS-FEDERATION/mappings”
  • “/v3/OS-FEDERATION/saml2/metadata”
  • “/v3/OS-FEDERATION/service_providers”
  • “/v3/OS-OAUTH1/access_token”
  • “/v3/OS-OAUTH1/consumers”
  • “/v3/OS-OAUTH1/request_token”
  • “/v3/OS-REVOKE/events”
  • “/v3/OS-SIMPLE-CERT/ca”
  • “/v3/OS-SIMPLE-CERT/certificates”
  • “/v3/OS-TRUST/trusts”
  • “/v3/policies”
  • “/v3/projects”
  • “/v3/regions”
  • “/v3/registered_limits”
  • “/v3/role_assignments”
  • “/v3/role_inferences”
  • “/v3/roles”
  • “/v3/s3tokens”
  • “/v3/services”
  • “/v3/users”
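For tools that are not shell pipelines, the same extraction can be sketched in Python. The sample document below is a trimmed illustration, and its relation names are placeholders I chose for readability; a real Keystone JSON Home response has many more entries. The one structural fact the code relies on is that templated resources carry an "href-template" key instead of "href".

```python
import json

# A trimmed, illustrative JSON Home document. The relation names are
# placeholders; a real response is much larger.
SAMPLE_JSON_HOME = json.loads("""
{"resources": {
  "rel/users": {"href": "/v3/users"},
  "rel/user": {"href-template": "/v3/users/{user_id}",
               "href-vars": {"user_id": "param/user_id"}},
  "rel/auth_tokens": {"href": "/v3/auth/tokens"},
  "rel/projects": {"href": "/v3/projects"}
}}
""")


def untemplated_hrefs(home_doc):
    """Return the sorted, de-duplicated hrefs that need no ID filled in."""
    hrefs = {
        resource["href"]
        for resource in home_doc["resources"].values()
        if "href" in resource  # templated entries use "href-template"
    }
    return sorted(hrefs)


print(untemplated_hrefs(SAMPLE_JSON_HOME))
```

Unlike the jq filter, which emits null for templated entries, this version skips them outright.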

This would be the friendly list to return from the /v3 page. Or, if we wanted to streamline it a bit for human consumption, we could put a top-level grouping around each of these APIs. A friendlier list would look like this (chopping off the /v3):

  • auth
  • assignment
  • catalog
  • federation
  • identity
  • limits
  • resource
  • policy

There are a couple ways to order the list. Alphabetical order is the simplest for an English speaker if they know what they are looking for. This won’t internationalize, and it won’t guide the user to the use cases that are most common. Thus, I put auth at the top, as that is, by far, the most common use case. The others I have organized based on a quick think-through from most to least common. I could easily be convinced to restructure this a couple different ways.

However, we are starting to trip over some of the other aspects of usability. We have provided the user with way more information than they need, or, indeed, can use at this point. Since none of those operations can be performed unauthenticated, we have led the user astray; we should show them, at this stage, only what they can do in their current state. Thus, the obvious entries would be:

  • /v3/auth/tokens
  • /v3/auth/OS-FEDERATION

These are the only two directions they can go unauthenticated.
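One way to picture this state-dependent discovery page is as a filter over the full list. This is only a sketch: the "anonymous" flag is an assumption I am making for the example, since Keystone's JSON Home carries no such marker, and visible_entries is a hypothetical helper.

```python
# Hypothetical discovery entries. The "anonymous" flag is an assumption made
# for this sketch; it marks what can be used without a token.
ENTRIES = [
    {"href": "/v3/auth/tokens", "anonymous": True},
    {"href": "/v3/auth/OS-FEDERATION", "anonymous": True},
    {"href": "/v3/users", "anonymous": False},
    {"href": "/v3/projects", "anonymous": False},
]


def visible_entries(entries, authenticated):
    """Return only the hrefs the caller can act on in their current state."""
    return [e["href"] for e in entries if authenticated or e["anonymous"]]


print(visible_entries(ENTRIES, authenticated=False))
```

An authenticated caller would see the whole list; a real implementation would narrow it further by role, which is the permission question from the start of this article.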

Let’s continue on with the old-school version of a token request using the v3/auth/tokens resource, as that is the most common use case. How now does a user request a token? It depends on whether they want to use a password, another token, or multifactor, and whether they want an unscoped token or a scoped token.

None of this information is in the JSON home. You have to read the docs.

If we were using straight HTML to render the response, we would expect a form. Something along the lines of:

There is, as of now, no standard way to put form data into JSON. However, there are numerous standards to choose from. One such standard is the FormData API; another is JSON Schema (https://json-schema.org/). If we look at the API doc, we get a table that specifies the name of each field. Anything that is not a single value is specified as an object, which really means a JSON object: a dictionary that can be deeply nested. We can see the complexity in the above form, where the scope value determines what is meant by the project/domain name field. And these versions don’t allow for IDs to be used instead of the names for users, projects, or domains.
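To show what a machine-readable form could buy us, here is a sketch of a JSON Schema for the password flavor of the token request, plus a small walker that lists what the user must supply. The request shape (auth, identity, methods, password, user, domain) is the one Keystone accepts; the schema itself is hypothetical, since Keystone does not publish one at this URL, and required_leaves is a helper I wrote for the example rather than part of any JSON Schema library.

```python
# Hypothetical schema for an unscoped, password-based token request.
USER_SCHEMA = {
    "type": "object",
    "required": ["name", "domain", "password"],
    "properties": {
        "name": {"type": "string"},
        "domain": {
            "type": "object",
            "required": ["name"],
            "properties": {"name": {"type": "string"}},
        },
        "password": {"type": "string"},
    },
}

TOKEN_REQUEST_SCHEMA = {
    "type": "object",
    "required": ["auth"],
    "properties": {"auth": {
        "type": "object",
        "required": ["identity"],
        "properties": {"identity": {
            "type": "object",
            "required": ["methods", "password"],
            "properties": {
                "methods": {"type": "array", "items": {"type": "string"}},
                "password": {
                    "type": "object",
                    "required": ["user"],
                    "properties": {"user": USER_SCHEMA},
                },
            },
        }},
    }},
}


def required_leaves(schema, prefix=""):
    """List dotted paths of the required non-object fields in a schema."""
    paths = []
    for name in schema.get("required", []):
        child = schema.get("properties", {}).get(name, {})
        path = f"{prefix}.{name}" if prefix else name
        if child.get("type") == "object":
            paths.extend(required_leaves(child, path))
        else:
            paths.append(path)
    return paths


for path in required_leaves(TOKEN_REQUEST_SCHEMA):
    print(path)
```

A client could render those paths as form fields, which is exactly the information the user currently has to dig out of the documentation.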

A lot of the custom approach here is dictated by the fact that Keystone does not accept standard authentication. The password-based token request could easily be replaced with BASIC-AUTH. Tokens themselves could be stored as session cookies, with the same timeouts as the token expiration. All of the one-offs in Keystone make it more difficult to use, and require more application-specific knowledge.
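For comparison, this is all it takes for a client to build a standard HTTP Basic credential. The function name is mine; the encoding is the one defined in RFC 7617.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header (RFC 7617)."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# The classic example credentials from the RFC.
print(basic_auth_header("Aladdin", "open sesame"))
```

Every HTTP client library and browser already knows how to do this, which is the point: no application-specific knowledge is needed.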

Many of these issues were straightened out when we started doing federation. Now, there is still some out-of-band knowledge required to use the Federated API, but this was due to concerns about information leaking that I am going to ignore for now. The approach I am going to describe is basically what is used by any app that allows you to log in from the different cloud providers’ identity sources today.

From the /v3 page, a user should be able to select the identity provider that they want to use. This could require a jump to /v3/FEDERATION and then to /v3/FEDERATION/idp, in order to keep things namespaced, or the list could be expanded in the /v3 page if there is really nothing else that a user can do unauthenticated.

Let us assume a case where there are three companies that all share access to the cloud: Acme, Beta, and Charlie. The JSON response would be the same as the list identity providers API. The interesting part of the result is this one here:

 "protocols": "http://example.com/identity/v3/OS-FEDERATION/identity_providers/ACME/protocols"

Let’s say that a given identity provider supports multiple protocols. Here is where the user gets to choose which one they want to use to try to authenticate. An HTTP GET on the link above would return that list. The documentation shows an example of an identity provider that supports saml2. Here is an expanded one that shows the set of protocols a user could expect in a private cloud running FreeIPA and Keycloak, or Active Directory and ADFS.

{
    "links": {
        "next": null,
        "previous": null,
        "self": "http://example.com/identity/v3/OS-FEDERATION/identity_providers/ACME/protocols"
    },
    "protocols": [
        {
            "id": "saml2",
            "links": {
                "identity_provider": "http://example.com/identity/v3/OS-FEDERATION/identity_providers/ACME",
                "self": "http://example.com/identity/v3/OS-FEDERATION/identity_providers/ACME/protocols/saml2"
            },
            "mapping_id": "xyz234"
        },
        {
            "id": "x509",
            "links": {
                "identity_provider": "http://example.com/identity/v3/OS-FEDERATION/identity_providers/ACME",
                "self": "http://example.com/identity/v3/OS-FEDERATION/identity_providers/ACME/protocols/x509"
            },
            "mapping_id": "xyz235"
        },
        {
            "id": "gssapi",
            "links": {
                "identity_provider": "http://example.com/identity/v3/OS-FEDERATION/identity_providers/ACME",
                "self": "http://example.com/identity/v3/OS-FEDERATION/identity_providers/ACME/protocols/gssapi"
            },
            "mapping_id": "xyz236"
        },
        {
            "id": "oidc",
            "links": {
                "identity_provider": "http://example.com/identity/v3/OS-FEDERATION/identity_providers/ACME",
                "self": "http://example.com/identity/v3/OS-FEDERATION/identity_providers/ACME/protocols/oidc"
            },
            "mapping_id": "xyz237"
        },
        {
            "id": "basic-auth",
            "links": {
                "identity_provider": "http://example.com/identity/v3/OS-FEDERATION/identity_providers/ACME",
                "self": "http://example.com/identity/v3/OS-FEDERATION/identity_providers/ACME/protocols/basic-auth"
            },
            "mapping_id": "xyz238"
        }
    ]
}

Note that this is very similar to the content that a web server gives back in a 401 response: the set of acceptable authentication mechanisms. I prefer this approach here, as it allows the user to select the appropriate mechanism for the use case, which may vary depending on where the user connects from.
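Client-side selection from such a list can be very small. The sketch below uses a trimmed copy of the protocol document above; the choose_protocol function and the idea of an ordered preference list are my assumptions about how a client might behave, not something Keystone prescribes.

```python
BASE = "http://example.com/identity/v3/OS-FEDERATION/identity_providers/ACME"

# A trimmed copy of the protocol list shown above.
PROTOCOLS_DOC = {
    "protocols": [
        {"id": "saml2", "links": {"self": BASE + "/protocols/saml2"}},
        {"id": "gssapi", "links": {"self": BASE + "/protocols/gssapi"}},
        {"id": "oidc", "links": {"self": BASE + "/protocols/oidc"}},
    ]
}


def choose_protocol(protocols_doc, preferences):
    """Return the 'self' link of the first preferred protocol on offer."""
    offered = {p["id"]: p["links"]["self"]
               for p in protocols_doc["protocols"]}
    for protocol_id in preferences:
        if protocol_id in offered:
            return offered[protocol_id]
    return None  # nothing the client supports is on offer


# A command-line client on a Kerberized host might prefer gssapi over oidc.
print(choose_protocol(PROTOCOLS_DOC, ["x509", "gssapi", "oidc"]))
```

The preference order lives with the client, so a browser and a command-line tool can make different choices against the same list.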

Let’s ignore the actual response from the above links and assume that, if the user is unauthenticated, they merely get a link to where they can authenticate: /v3/OS-FEDERATION/identity_providers/{idp_id}/protocols/{protocol_id}/auth. The follow-on link is a GET, not a POST. There is no form data required. The mapping resolves the user’s domain name/ID, so there is no need to provide that information, and the token is a federated unscoped token.
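Ideally the server hands the client this link ready-made. If a client has to expand the template itself, a minimal sketch looks like the following; federated_auth_path is a hypothetical helper, and the percent-encoding of the two IDs is a precaution I added.

```python
from urllib.parse import quote

AUTH_TEMPLATE = ("/v3/OS-FEDERATION/identity_providers/{idp_id}"
                 "/protocols/{protocol_id}/auth")


def federated_auth_path(idp_id, protocol_id):
    """Expand the federated auth template, percent-encoding both IDs."""
    return AUTH_TEMPLATE.format(
        idp_id=quote(idp_id, safe=""),
        protocol_id=quote(protocol_id, safe=""),
    )


print(federated_auth_path("ACME", "saml2"))
```

A plain GET on the resulting path is all that is left; there is no request body to assemble.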

The actual response contains the list of groups that a user belongs to. This is an artifact of the mapping, and it is useful for debugging. However, what the user has at this point is, effectively, an unscoped token. It is passed in the X-Subject-Token header, and not in the session cookie. However, for an HTML based workflow, and, indeed, for sane HTTP workflows against Keystone, a session scoped cookie containing the token would be much more useful.
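Here is a rough sketch of what carrying the token in a cookie could look like on the server side, using only the Python standard library. Keystone does not do this; the cookie name keystone_token, the function name, and the choice to derive Max-Age from the token expiry are all assumptions for illustration.

```python
from datetime import datetime, timezone
from http.cookies import SimpleCookie


def token_cookie(token, expires_at, now):
    """Wrap a token in a cookie that expires when the token does."""
    cookie = SimpleCookie()
    cookie["keystone_token"] = token  # hypothetical cookie name
    morsel = cookie["keystone_token"]
    morsel["max-age"] = max(0, int((expires_at - now).total_seconds()))
    morsel["path"] = "/"
    morsel["secure"] = True    # only send over TLS
    morsel["httponly"] = True  # keep it away from page scripts
    return cookie


now = datetime(2020, 4, 7, 0, 0, tzinfo=timezone.utc)
expires_at = datetime(2020, 4, 7, 1, 0, tzinfo=timezone.utc)
cookie = token_cookie("abc123", expires_at, now)
print(cookie.output())
```

With the token in a cookie, a browser or any cookie-aware HTTP client would present it on later requests without custom header handling.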

With an unscoped token, a user can perform some operations against a Keystone server, but those operations are either read-only, operations specific to the user, or administrative actions specific to the Keystone server. For OpenStack, the vast majority of the time the user is going to Keystone to request a scoped token to use on one of the other services. As such, the user probably needs to convert the unscoped token shown above to a token scoped to a project. A very common setup has the user assigned to a single project. Even if they are assigned to several, it is unlikely that they are assigned to many. Thus, the obvious next step is to show the user a URL that will allow them to get a token scoped to a specific project.

Keystone does not have such a URL. In Keystone, again you are required to go through /v3/auth/tokens to request a scoped token.
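For reference, this is the body a client has to assemble today to trade an unscoped token for a project-scoped one. The structure follows the v3 token API ("token" as the identity method, plus a "scope" section); the function name and the sample values are placeholders of mine.

```python
import json


def scoped_token_request(unscoped_token, project_id):
    """Build the POST body that rescopes a token to a single project."""
    return {
        "auth": {
            "identity": {
                "methods": ["token"],
                "token": {"id": unscoped_token},
            },
            "scope": {"project": {"id": project_id}},
        }
    }


# Placeholder values; the body is POSTed to /v3/auth/tokens.
body = scoped_token_request("unscoped-token-value", "abc123")
print(json.dumps(body, indent=2))
```

None of this nesting is discoverable from the server; the client has to know it in advance.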

A much friendlier URL scheme would be /v3/auth/projects, which lists the set of projects a user can request a token for, and /v3/auth/project/{id}, which lets a user request a scoped token for that project.

However, even if we had such a URL pattern, we would need to direct the user to that URL. There are two distinct use cases. The first is the case where the user has just authenticated, and in the token response, they need to see the project list URL. A redirect makes the most sense, although the list of projects could also be in the authentication response. However, the user might also be returning to the Keystone server from some other operation, still have the session cookie with the token in it, and start at the discovery page again. In this case, the /v3/ response should show /v3/auth/projects/ in its list.

There is, unfortunately, one case where this would be problematic. With hierarchical projects, a single assignment could allow a user to get a token for many projects. While this is a useful hack in practice, it means that the project list page could get extremely long. This is, unfortunately, also the case with the project list page itself: projects may be nested, but the namespace needs to be flat, so listing projects will list all of them, with only the parent-project ID distinguishing them. Since we do have ways to do path nesting in HTTP, this is a solvable problem. Let’s lump the token request and the project list APIs together. This actually makes a very elegant solution:

Instead of /v3/auth/projects we put a link off the project page itself back to /v3/auth/tokens but accepting the project ID as a URL parameter, like this: /v3/auth/tokens?project_id=abc123.
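Generating that link for each project is a one-liner on the server. This is a sketch of the proposal only: Keystone does not accept a project_id query parameter on this URL today, and both function names and the "token" link key are hypothetical.

```python
from urllib.parse import urlencode


def scoped_token_link(project_id, base="/v3/auth/tokens"):
    """Build the proposed per-project token link."""
    return f"{base}?{urlencode({'project_id': project_id})}"


def with_token_link(project):
    """Return a copy of a project entry with the token link attached."""
    links = dict(project.get("links", {}))
    links["token"] = scoped_token_link(project["id"])
    return {**project, "links": links}


print(with_token_link({"id": "abc123", "name": "demo"}))
```

Because the link hangs off each project entry, the nesting of projects stops mattering: wherever the user finds the project, the way to get a token for it is right there.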

Of course, this means that there is a hidden mechanism now. If a user wants to look at any resource in Keystone, they can do so with an unscoped token, provided they have a role assignment on the project or domain that manages that object.

To this point we have discussed implicit answers to the questions of finding URLs and discovering what actions a user can perform. For the token request, I started discussing how to provide the answer to “What information do I need to provide in order to perform this action?” I think now we can state how to do that: the list page for any collection should either provide an inline form or a link to a form URL. The form provides the information in a format that makes sense for the content type. If the user does not have the permission to create the object, they should not see the form. If the form is on a separate link, a user that cannot create that object should get back a 403 error if they attempt to GET the URL.
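The two rules in that paragraph can be sketched in a few lines. Everything here is hypothetical: the function names, the page layout, and the can_create flag, which stands in for whatever policy check the server really performs.

```python
def collection_page(items, can_create, form):
    """Build a list page, embedding the create form only when permitted."""
    page = {"items": items}
    if can_create:
        page["form"] = form
    return page


def get_form(can_create, form):
    """Handle a GET on a separate form URL: 403 unless creation is allowed."""
    if not can_create:
        return 403, None
    return 200, form


# A made-up form description for creating a user.
USER_FORM = {"fields": ["name", "domain_id", "password"]}

print(collection_page(["alice"], can_create=False, form=USER_FORM))
print(get_form(can_create=False, form=USER_FORM))
```

Either way, the user learns what they can do from what the server shows them, not from a failed request.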

If Keystone had been written to return HTML when hit by a browser instead of JSON, all of this navigation would have been painfully obvious. Instead, we subscribed to the point of view that UI was to be done by the Horizon server.

There still remains the last question: “What permission do I need in order to perform this action?” The user only thinks to answer this question when they come across an operation that they cannot perform. I’ll dig deeper into this in the next article.

