ToddFredrich.com RESTful/Platform APIs and Examples, Tutorials and the Software Craft. Mostly…

14 Aug 2013

What to Use as Identifiers (IDs) in REST APIs

TIP: Publicly exposed identifiers (IDs), such as those in your RESTful URLs, should not expose the underlying technology and, in most cases, should not carry business meaning.

For a long time it has been good practice to ensure that primary keys in your database tables carry no business semantics, so that the keys don't have to change when the business meaning changes. We've all seen cases where an SSN, phone number, or email address used as a primary key turned out to be a bad choice.

If you're familiar with this practice, then it's not news to you that IDs exposed in REST APIs generally have the same rules--a URL is supposed to uniquely identify a resource and not change over time.

The Problem

The catch is that we often end up exposing our underlying technology in those identifiers. Consider a URL like /users/12345, where 12345 is a user identifier. Is that the underlying database key, perhaps from MySQL, where the user_id column is an AUTO_INCREMENT integer? Problemo when you reimplement your user resources in MongoDB and decide to use the MongoDB ObjectId, which looks something like 4e20885deabfa3a2586b5fb1.

Even more subtle is the case where we mix MySQL tables, MongoDB documents, Cassandra keyspaces, and Redis objects and expose their native IDs in URLs. Your RESTful APIs can then look quite cluttered and confuse your consumers.

The Proposal

Nearly all databases, whether NoSQL or relational, now support the concept of a UUID (universally unique identifier). A UUID is a 16-byte (128-bit) value. Displayed as a string, it is 32 hexadecimal digits in five groups separated by hyphens, for a total of 36 characters (32 hexadecimal digits plus four hyphens). For example (see the Wikipedia article on UUIDs for more detail):

550e8400-e29b-41d4-a716-446655440000
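
To make those numbers concrete, here is a minimal sketch using Python's standard uuid module (Python is my choice for illustration, not something the post depends on). It parses the example above and checks the sizes just described.

```python
import uuid

# The example UUID from the text above.
example = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

raw = example.bytes        # the 16-byte (128-bit) binary form
text = str(example)        # the canonical 36-character string form
groups = text.split("-")   # the five hyphen-separated groups

print(len(raw))                  # 16
print(len(text))                 # 36
print([len(g) for g in groups])  # [8, 4, 4, 4, 12]
print(example.version)           # 4
```

The version number lives in the first digit of the third group (the "4" in "41d4"), which is how you can tell a random UUID from the other variants at a glance.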

While those 36 characters are cumbersome to type, using version 4 (random) UUIDs as primary keys, row keys, and so on in your databases makes things look much more cohesive wherever unique identifiers are exposed in your RESTful APIs. Besides, most of the time the ID is simply used by JavaScript or some other consumer that doesn't care how long the IDs in your URLs are.
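
As a rough sketch of what that might look like, the snippet below keys a table on a random UUID and builds the resource URL from it. It assumes an in-memory SQLite database purely for convenience; the users table, create_user, and user_url are hypothetical names made up for this illustration.

```python
import sqlite3
import uuid


def create_user(conn, name):
    """Insert a user keyed by a random (version 4) UUID and return the new ID."""
    user_id = str(uuid.uuid4())
    conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (user_id, name))
    return user_id


def user_url(user_id):
    """Build the public URL; nothing in it reveals which datastore is behind it."""
    return f"/users/{user_id}"


# Hypothetical schema: the primary key is the UUID string, not an auto-increment integer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT NOT NULL)")

new_id = create_user(conn, "alice")
print(user_url(new_id))
```

Because the ID is generated in application code rather than by the database, the same approach should carry over if the users later move to a different store.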

One More Thing

Don't like all those characters and hyphens in your URL? Well, you can Base64 encode your UUIDs before displaying them in URLs and Base64 decode them on the way back in--except that standard Base64 is not URL safe, because its alphabet includes '+' and '/'. So you have to either URL encode/decode the result or use a URL-safe Base64 encoder/decoder (like the one available in RestExpress's own RepoExpress UuidConverter). With the trailing '==' padding dropped, this gets your UUIDs down to 22 characters instead of the normal 36.
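
Here is a small sketch of the URL-safety problem and the two ways around it, using only Python's standard library. The all-ones UUID is a contrived value I picked because its standard Base64 form is full of '/' characters; real random UUIDs hit '+' or '/' less predictably.

```python
import base64
import uuid
from urllib.parse import quote

# Contrived value: every bit set, so standard Base64 yields many '/' characters.
all_ones = uuid.UUID("ffffffff-ffff-ffff-ffff-ffffffffffff")

# Standard alphabet: may contain '+' and '/', which have special meaning in URLs.
standard = base64.b64encode(all_ones.bytes).decode("ascii")

# Option 1: percent-encode the standard form (longer and harder to read).
percent_encoded = quote(standard, safe="")

# Option 2: use the URL-safe alphabet, which swaps '+' for '-' and '/' for '_'.
url_safe = base64.urlsafe_b64encode(all_ones.bytes).decode("ascii")

print(standard)
print(percent_encoded)
print(url_safe)
```

Both options round-trip correctly, but the URL-safe alphabet keeps the ID short, which is the whole point of the exercise.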

Recommendation: IDs generated by an API should take the form of URL-safe, Base64-encoded UUIDs, which are 22 characters long. For example, an ID that looks like "abcdEFh4520juieUKHWgJQ" instead of one that looks like "550e8400-e29b-41d4-a716-446655440000".
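
The conversion itself is only a few lines. What follows is a minimal Python sketch of the idea, not the RepoExpress UuidConverter; encode_uuid and decode_uuid are names invented here, and the sketch assumes the two '=' padding characters are stripped on the way out and restored on the way back in.

```python
import base64
import uuid


def encode_uuid(value: uuid.UUID) -> str:
    """Return the 22-character URL-safe Base64 form of a UUID."""
    encoded = base64.urlsafe_b64encode(value.bytes).decode("ascii")
    # 16 bytes always encode to 24 characters ending in '=='; drop the padding.
    return encoded.rstrip("=")


def decode_uuid(short_id: str) -> uuid.UUID:
    """Turn a 22-character ID from a URL back into a UUID."""
    if len(short_id) != 22:
        raise ValueError("expected a 22-character ID")
    # Restore the padding that encode_uuid removed before decoding.
    return uuid.UUID(bytes=base64.urlsafe_b64decode(short_id + "=="))


original = uuid.uuid4()
short_id = encode_uuid(original)

print(short_id)                # 22 characters, safe to place in a URL path
print(decode_uuid(short_id))   # the original 36-character form
```

The database can keep storing the UUID in whatever native form it prefers; only the API boundary needs to know about the short form.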

Feedback? What are your thoughts and experiences? Please offer your comments below...

Comments (5)
  1. But using database IDs in a REST API URI can't be considered a security issue, right?

    If I ensure that only authenticated users (by an API key, for example) have permission to CRUD that resource, using database IDs in the URI is no problem, right?

    Thanks

  2. Thanks for the great ideas!

  3. That's a mighty big key and can impact the total size of returned data in lists. Also, if using GUIDs in various databases, such as SQL Server, beware of both the key size and the sorting / indexing implications.

  4. @tiago It can be seen as a security issue unless you never make mistakes. Anyway, it is more of a preventive good practice that helps manage risk in case of developer mistakes.

  5. P.S. My comment above applies only to database IDs that do not satisfy these requirements: unique, unguessable. A typical example of a bad ID is the integer auto-increment column used as a PK in RDBMS tables: such IDs are unique, but guessable. So it is better to avoid using them for resource identification in public communication (communication between frontend and backend is also public communication).

