What to Use as Identifiers (IDs) in REST APIs

TIP: Publicly exposed identifiers (IDs), such as those in your RESTful URLs, should not expose underlying technology and, in most cases, should not contain business meaning.

For a long time it’s been good practice to ensure that primary keys in your database tables do not contain business semantics, so that they don’t have to change when the business meaning changes.  We’ve all seen cases where an SSN, phone number, or email address used as a primary key turned out to be a bad choice.

If you’re familiar with this practice, then it’s not news to you that IDs exposed in REST APIs generally have the same rules–a URL is supposed to uniquely identify a resource and not change over time.

The Problem

The catch is that we often end up exposing our underlying technology in those identifiers. Consider a URL like /users/12345, where 12345 is a user identifier.  Is that the underlying database column, perhaps in MySQL, where the user_id column is a BIGINT with AUTO_INCREMENT?  Problemo when you reimplement your user resources in MongoDB and decide to use the MongoDB ObjectId, which looks something like 4e20885deabfa3a2586b5fb1.

Even more subtle is the case where resources backed by a mixture of MySQL tables, MongoDB documents, Cassandra keyspaces, or Redis objects each expose their native IDs in URLs. Your RESTful APIs can then look quite cluttered and be confusing to your consumers.

The Proposal

Nearly all databases, whether NoSQL or relational, now support the concept of a UUID (universally unique identifier).  It is a 16-byte (128-bit) value that, when displayed as a string, is 32 hexadecimal digits in five groups separated by hyphens, for a total of 36 characters (32 hexadecimal characters plus four hyphens). For example (see the Wikipedia article on UUIDs for more detail):

550e8400-e29b-41d4-a716-446655440000

While those 36 characters are cumbersome to type, using Version 4 (random) UUIDs as primary keys, row keys, etc. in your databases makes the unique identifiers exposed in your RESTful APIs look much more cohesive.  Besides, most of the time the ID is simply used by a JavaScript or other client that doesn’t care how long the IDs in your URLs are.
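To make those numbers concrete, here is a minimal sketch using Python’s standard uuid module. The variable names are made up for illustration, and your own stack will have its own equivalent of uuid4().

```python
import uuid

# Generate a random (version 4) UUID to use as a public identifier.
user_id = uuid.uuid4()

# The canonical string form: 32 hex digits in five hyphen-separated groups.
text = str(user_id)

print(text)                # a different 36-character string on every run
print(len(text))           # 36 characters, including four hyphens
print(len(user_id.bytes))  # 16 bytes (128 bits) of underlying data
print(user_id.version)     # 4
```

Nothing in the resulting string hints at which database, if any, produced it.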

One More Thing

Don’t like all those characters and hyphens in your URL?  Well, you can Base64 encode your UUIDs before displaying them in URLs and Base64 decode them on the way back in, except that standard Base64 is not URL safe: its alphabet includes ‘+’ and ‘/’, and it pads with ‘=’.  So you have to either URL encode/decode them or use a URL-safe Base64 encoder/decoder (like the one in RestExpress’s own RepoExpress UuidConverter).  With the padding dropped, this gets your UUIDs down to 22 characters instead of the normal 36.

Recommendation: IDs generated by an API should be web-safe, Base64-encoded UUIDs, which are 22 characters long. For example, an ID like “abcdEFh4520juieUKHWgJQ” instead of one like “550e8400-e29b-41d4-a716-446655440000”.
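If you’re wondering what that round trip might look like, here is a rough sketch in Python using only the standard library. It is not the RepoExpress UuidConverter; encode_uuid and decode_uuid are hypothetical helper names chosen for this example.

```python
import base64
import uuid


def encode_uuid(value: uuid.UUID) -> str:
    """Return the 22-character URL-safe Base64 form of a UUID."""
    encoded = base64.urlsafe_b64encode(value.bytes).decode("ascii")
    # 16 bytes always encode to 24 characters ending in "=="; drop the padding.
    return encoded.rstrip("=")


def decode_uuid(text: str) -> uuid.UUID:
    """Parse a 22-character URL-safe Base64 string back into a UUID."""
    if len(text) != 22:
        raise ValueError("expected a 22-character identifier")
    # Restore the two padding characters removed by encode_uuid.
    return uuid.UUID(bytes=base64.urlsafe_b64decode(text + "=="))


original = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")
short_id = encode_uuid(original)

print(len(short_id))                      # 22
print(decode_uuid(short_id) == original)  # True
```

The URL-safe alphabet swaps ‘+’ and ‘/’ for ‘-’ and ‘_’, so the short form can go straight into a path segment without further escaping.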

Feedback?  What are your thoughts and experiences?  Please offer your comments below…

REST API Versioning: Good, Bad or Ugly?

The Prelims

A lot of effort and words are spent on making sure we account for business change in our RESTful services.  One of the ‘contingencies’ we often plan for is versioning of resources and/or representations.  Historically, the usual versioning recommendations were:

  • Place a version number, something like ‘v1’, in the URL, presumably high in the URL node hierarchy.  For example, api.example.com/v1/users/{id}
  • Version early, as in right from the start, meaning that the initial release should have a version in the URL.

This does have some advantages.  First up, it’s obvious which version is in play.  It could easily be described as “the path of least surprise,” as it makes the version part of every request.  It also makes for easy testing via a browser, curl, or a JavaScript client (such as jQuery), since there are no additional settings to adjust before making the request.

However, as our APIs mature and become increasingly linked using hypermedia concepts (HATEOAS), version numbers in the URL cause issues when one API links to another and the two don’t get a version bump at the same time.  It’s almost impossible to support a cohesive migration strategy when linking is involved.  With a lot of services in play, this option becomes completely untenable.  Additionally, from a completely academic REST standpoint, a URL is supposed to be the identifier for a resource.  So, conceptually, a URL with a version number in it doesn’t purely identify a resource: it serves more than the single purpose of identification and also identifies the representation “shape.”

This technique flies in the face of the REST constraints, as it doesn’t embrace the built-in header system of the HTTP specification, nor does it support the idea that a new URI should be added only when a new resource or concept is introduced, not when a representation changes. Another argument against it is that resource URIs aren’t meant to change over time. A resource is a resource.

The Question

In the spirit of agile development, I have a question for you… is a discussion on how to version even relevant?  Can we claim YAGNI (“You Aren’t Gonna Need It”) and encourage people to version as late as possible or not at all?  Before you click the ‘back’ button, hear me out…

The Proposal

I’ve been hanging out with a bunch of smart people at API Craft in Detroit this week.  A lot of passionate folks are talking about versioning and coming to this conclusion:

Evolve Instead of Version!

With many of the techniques and technologies we’re using today, maintaining backwards compatibility is more achievable than ever.  With a little forethought and planning, we can leverage these to create RESTful APIs that stay backwards compatible and so don’t require versioning.

Use of JSON gets us a long way there, as most JSON-parsing libraries can ignore properties the client doesn’t recognize, so new properties in a response don’t cause parse issues. So as long as we don’t change the semantics of existing properties or remove existing properties, our consumers shouldn’t break.
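Here is a small sketch of what that tolerance can look like on the client side, written in Python with the standard json module. The User class, the parse_user helper, and the added “nickname” property are all made up for this example.

```python
import json
from dataclasses import dataclass


@dataclass
class User:
    id: str
    name: str


def parse_user(body: str) -> User:
    """Build a User from only the properties this client knows about."""
    data = json.loads(body)
    # Unknown properties are ignored rather than treated as errors.
    return User(id=data["id"], name=data["name"])


old_response = '{"id": "12345", "name": "Joe DiMaggio"}'
# A later release adds a property; "nickname" is invented for this example.
new_response = '{"id": "12345", "name": "Joe DiMaggio", "nickname": "Yankee Clipper"}'

print(parse_user(old_response) == parse_user(new_response))  # True
```

The client reads the same two properties from both payloads, so the service can add to its representation without a version bump. Renaming or removing “name”, on the other hand, would raise a KeyError here, which is exactly the kind of change to avoid.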

Additionally, by leveraging a linking strategy, clients can eliminate the use of their own hard-coded links in favor of rendering widgets on the UI based on the links in your response payload.  For example, if your UI is displaying user details, the JSON representation would contain user properties.  In addition, the UI can decide whether to render an ‘Edit’ button based on whether an ‘edit’ link exists in your returned response.  Also, the button would leverage the URL exposed by that ‘edit’ link instead of relying on its own hard-coded URL value to perform the edit.  The same is true for ‘pagination’ links, supporting ‘first’, ‘last’, ‘next’, and ‘previous’ operations on large collections of data.

While a client of this nature may be harder to create, the style is a lot more resilient and dynamic: it relies on the links in the response, which enables changes to the underlying service without breaking the client.
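As a rough illustration of a link-driven client, the Python sketch below decides whether to show an ‘Edit’ button from the links in the payload. The “links” array with “rel” and “href” properties is an assumed format for this example, as is the find_link helper; your API may represent links differently.

```python
import json
from typing import Optional


def find_link(resource: dict, rel: str) -> Optional[str]:
    """Return the href of the first link with the given rel, or None."""
    for link in resource.get("links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None


body = """
{
  "id": "12345",
  "name": "Joe DiMaggio",
  "links": [
    {"rel": "self", "href": "http://api.example.com/users/12345"},
    {"rel": "edit", "href": "http://api.example.com/users/12345"}
  ]
}
"""
user = json.loads(body)

edit_url = find_link(user, "edit")
show_edit_button = edit_url is not None  # render the button only if the link exists
delete_url = find_link(user, "delete")   # None: the service offered no such link
```

The client never builds the edit URL itself, so the service is free to move it later.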

The Stop-Gap (or “Plan B”)

Inevitably there will come a time when an API requires a change to its returned or expected representation that would break existing consumers, and that breakage must be avoided. Versioning your API is the way to avoid breaking your clients and consumers.  And, as mentioned above, the URI should simply identify the resource, not its ‘shape’.  Therefore, another concept must be used to specify the format of the response (representation).

That “other concept” is a pair of HTTP headers: Accept and Content-Type.  The Accept header allows clients to specify the media type (or types) of the response they desire or can support. The Content-Type header is used by both clients and servers to indicate the format of the request or response body, respectively.

For example, to retrieve a user in JSON format:

# Request
GET http://api.example.com/users/12345
Accept: application/json; version=1

# Response
HTTP/1.1 200 OK
Content-Type: application/json; version=1

{"id":"12345", "name":"Joe DiMaggio"}

Now, to retrieve version 2 of that same resource in JSON format:

# Request
GET http://api.example.com/users/12345
Accept: application/json; version=2

# Response
HTTP/1.1 200 OK
Content-Type: application/json; version=2

{"id":"12345", "firstName":"Joe", "lastName":"DiMaggio"}

Notice how the URI is the same for both versions as it identifies the resource, with the Accept header being used to indicate the format (and version in this case) of the desired response. Alternatively, if the client desired an XML formatted response, the Accept header would be set to ‘application/xml’ instead, with a version specified, if needed.
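On the server side, pulling the version out of the Accept header might look something like the Python sketch below. It is deliberately simplified: parse_accept is a hypothetical helper that looks only at the first media range and ignores q-values, which a real content-negotiation implementation would need to honor.

```python
from typing import Optional, Tuple


def parse_accept(header: str) -> Tuple[str, Optional[int]]:
    """Split the first media range of an Accept header into (media_type, version)."""
    # Only the first media range is considered; q-values are ignored.
    first_range = header.split(",")[0]
    media_type, *params = [part.strip() for part in first_range.split(";")]
    version = None
    for param in params:
        name, _, value = param.partition("=")
        if name.strip().lower() == "version" and value.strip().isdigit():
            version = int(value.strip())
    return media_type.lower(), version


print(parse_accept("application/json; version=2"))  # ('application/json', 2)
print(parse_accept("application/xml"))              # ('application/xml', None)
```

A version of None means the client didn’t ask for one, which leads straight to the next question.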

What version is returned when no version is specified?

There are basically three schools of thought on this…

  • Return the latest (the most recent) version by default.
  • Return the earliest supported version (the oldest supported version) by default.
  • Require a version specified in the request and return an error if not specified.

The first option, returning the latest version by default, carries the most risk of breaking clients: releasing a new version instantly exposes every client that doesn’t specify a version to breaking changes, without warning.

The second option, returning the oldest supported version, also has the potential to break clients, though more rarely.  The potential exists only when the oldest supported version is deprecated and removed from service, so that a newer version becomes the oldest supported one.

The final option, requiring a version specified in the request, makes things explicit but has the downside of making every request more complex.

For my use cases, I favor option #2.  Perhaps it’s a case-by-case discussion, but I highly recommend being consistent.  Pick one and stick with it.
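For what it’s worth, option #2 can be sketched in a few lines of Python. SUPPORTED_VERSIONS, UnsupportedVersion, and resolve_version are hypothetical names for this example, and the list of versions is invented.

```python
from typing import Optional

SUPPORTED_VERSIONS = [1, 2]  # hypothetical: versions this service still serves


class UnsupportedVersion(Exception):
    """Raised when a client asks for a version that is not available."""


def resolve_version(requested: Optional[int]) -> int:
    """Pick the representation version to return for a request."""
    if requested is None:
        # Option #2: fall back to the oldest version still supported.
        return min(SUPPORTED_VERSIONS)
    if requested not in SUPPORTED_VERSIONS:
        raise UnsupportedVersion(f"version {requested} is not supported")
    return requested


print(resolve_version(None))  # 1
print(resolve_version(2))     # 2
```

Switching to option #1 would mean using max() instead of min(), and option #3 would mean raising an error when requested is None. A service might reasonably map UnsupportedVersion to a 406 Not Acceptable response, though that choice is an assumption here.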

Summary

Straight-up, versioning is hard, arduous, difficult, fraught with heartache, even pain and extreme sadness–let’s just say it adds a lot of complexity to an API and possibly to the clients that access it. Consequently, be deliberate in your API design and make efforts to not need versioned representations.

Favor not versioning, instead of using versioning as a crutch for poor API design. You’ll hate yourself in the morning if you need to version your APIs at all, let alone frequently. Lean on the idea that, with JSON used for representations, clients can be tolerant of new properties appearing in a response without breaking.  If you must version a representation, do it as late as possible and use the Accept and Content-Type header combo to accomplish your goal.

What are your experiences and thoughts?  Submit a comment below!

Instant REST Services with RESTExpress Q&A

This is a follow-on to my last two posts, Introduction to REST (Revisited) and RESTExpress Overview and Tutorial. This short video is the Q&A after the presentation.  It covers authentication, authorization, and some of the RESTExpress features around sorting, filtering and performance.  It’s a quickie, but it’s always nice to understand what others are asking… and some answers to those questions.

[youtube http://youtu.be/z5u4rZTK8o0]

RESTExpress Overview and Tutorial

In my last post, Intro to REST (Revisited), the video covered the six constraints of the REST architectural style, gave the background for RESTExpress, the Java REST service framework, and introduced a sample project.  The video below is part two of the presentation, where we dive in and create a real, working service suite using RESTExpress around a blogging system that uses MongoDB for its back-end store.  The reference implementation demo’d during the presentation is available on GitHub and supports linking, pagination, filtering and sorting of the collection results returned (for blogs, entries, and comments).

This video goes into a bit of depth on how all that gets accomplished in your service suites with a minimum of coding.

[youtube http://youtu.be/hHDO6soGehc]

If you haven’t heard, RESTExpress is a lightweight micro-framework that, along with some other micro-frameworks, supports rapid development of highly scalable, high-performance REST services with JSON and XML payloads.  It is an active open source project that is gaining momentum.  You can get more information at the resources below:

Introduction to REST (Revisited)

I get a lot of questions around RESTful practices, and I see a lot of confusion out there in REST API land about what REST really is, what it means, and how to implement it.  In November 2012, I talked at the Pearson Technology Summit in Denver, Colorado.  The talk is entitled “Instant REST Services with RESTExpress” but spends some time during the first half revisiting the six REST architectural constraints.  While the lapel mic was giving me trouble and causing some noise on the recording, the content is definitely understandable.  Enjoy.

[youtube http://youtu.be/XcNDRr5zaI0]

The Six Constraints of the RESTful Architectural Style are:

  • Uniform Interface
  • Stateless
  • Client-Server
  • Layered System
  • Cacheable
  • Code on Demand