How To Process Java Annotations

One of the cool new features of Java 8 is the support for lambda expressions. Lambda expressions lean heavily on the FunctionalInterface annotation.

In this post, we’ll look at annotations and how to process them so you can implement your own cool features.

Annotations

Annotations were added in Java 5. The Java language comes with some predefined annotations, but you can also define custom annotations.

Many frameworks and libraries make good use of custom annotations. JAX-RS, for instance, uses them to turn POJOs into REST resources.

Annotations can be processed at compile time or at runtime (or even both).

At runtime, you can use the reflection API. Each element of the Java language that can be annotated, like class or method, implements the AnnotatedElement interface. Note that an annotation is only available at runtime if it has the RUNTIME RetentionPolicy.

Compile-Time Annotation Processing

Java 5 came with the separate apt tool to process annotations, but since Java 6 this functionality is integrated into the compiler.

You can either call the compiler directly, e.g. from the command line, or indirectly, from your program.

In the former case, you specify the -processor option to javac, or you use the ServiceLoader framework by adding the file META-INF/services/javax.annotation.processing.Processor to your jar. The contents of this file should be a single line containing the fully qualified name of your processor class.

The ServiceLoader approach is especially convenient in an automated build, since all you have to do is put the annotation processor on the classpath during compilation, which build tools like Maven or Gradle will do for you.

Compile-Time Annotation Processing From Within Your Application

You can also use the compile-time tools to process annotations from within your running application.

Rather than calling javac directly, use the more convenient JavaCompiler interface. Either way, you’ll need to run your application with a JDK rather than just a JRE.

The JavaCompiler interface gives you programmatic access to the Java compiler. You can obtain an implementation of this interface using ToolProvider.getSystemJavaCompiler(). This method is sensitive to the JAVA_HOME environment variable.

The getTask() method of JavaCompiler allows you to add your annotation processor instances. This is the only way to control the construction of annotation processors; all other methods of invoking annotation processors require the processor to have a public no-arg constructor.

Annotation Processors

A processor must implement the Processor interface. Usually you will want to extend the AbstractProcessor base class rather than implement the interface from scratch.

Each annotation processor must indicate the types of annotations it is interested in through the getSupportedAnnotationTypes() method. You may return * to process all annotations.

The other important thing is to indicate which Java language version you support. Override the getSupportedSourceVersion() method and return one of the RELEASE_x constants.

With these methods implemented, your annotation processor is ready to get to work. The meat of the processor is in the process() method.

When process() returns true, the annotations processed are claimed by this processor, and will not be offered to other processors. Normally, you should play nice with other processors and return false.

Elements and TypeMirrors

The annotations and the Java elements they are present on are provided to your process() method as Element objects. You may want to process them using the Visitor pattern.

The most interesting types of elements are TypeElement for classes and interfaces (including annotations), ExecutableElement for methods, and VariableElement for fields.

Each Element points to a TypeMirror, which represents a type in the Java programming language. You can use the TypeMirror to walk the class relationships of the annotated code you’re processing, much like you would using reflection on the code running in the JVM.

Processing Rounds

Annotation processing happens in separate stages, called rounds. During each round, a processor gets a chance to process the annotations it is interested in.

The annotations to process and the elements they are present on are available via the RoundEnvironment parameter passed into the process() method.

If annotation processors generate new source or class files during a round, then the compiler will make those available for processing in the next round. This continues until no more new files are generated.

The last round contains no input, and is thus a good opportunity to release any resources the processor may have acquired.

Initializing and Configuring Processors

Annotation processors are initialized with a ProcessingEnvironment. This processing environment allows you to create new source or class files.

It also provides access to configuration in the form of options. Options are key-value pairs that you can supply on the command line to javac using the -A option. For this to work, you must return the options’ keys in the processor’s getSupportedOptions() method.

Finally, the processing environment provides some support routines (e.g. to get the JavaDoc for an element, or to get the direct super types of a type) that come in handy during processing.

Classpath Issues

To get the most accurate information during annotation processing, you must make sure that all imported classes are on the classpath, because classes that refer to types that are not available may have incomplete or altogether missing information.

When processing large numbers of annotated classes, this may lead to a problem on Windows systems where the command line becomes too large (> 8K). Even when you use the JavaCompiler interface, it still calls javac behind the scenes.

The Java compiler has a nice solution to this problem: you can use argument files that contain the arguments to javac. The name of the argument file is then supplied on the command line, preceded by @.

Unfortunately, the JavaCompiler.getTask() method doesn’t support argument files, so you’ll have to use the underlying run() method.

Remember that the getTask() approach is the only one that allows you to construct your annotation processors. If you must use argument files, then you have to use a public no-arg constructor.

If you’re in that situation, and you have multiple annotation processors that need to share a single instance of a class, you can’t pass that instance into the constructor, so you’ll be forced to use something like the Singleton pattern.

Conclusion

Annotations are an exciting technology that have lots of interesting applications. For example, I used them to extract the resources from a REST API into a resource model for further processing, like generating documentation.

I’m very interested to learn what you have used them for. Please leave a comment below.

Advertisement

The State of REST

rest-easyThe S in REST stands for State. Unfortunately, state is an overloaded word.

In this post I’ll discuss the two different kinds of state that apply to REST APIs.

Applications

The first type of state is application state, as in Hypermedia As The Engine Of Application State (HATEOAS), the distinguishing feature of the REST architectural style.

We must first understand what exactly an application in a RESTful architecture is and isn’t. A REST API is not an interface to an application, but an interface for an application.

A RESTful service is just a bunch of interrelated and interconnected resources. In and of themselves they don’t make up an application. It’s a particular usage pattern of those resources that turn them into an application.

The term application pertains more to the client than to the server. It’s possible to build different applications on top of the same resources. It’s also possible to build an application out of resources hosted on different servers or even by different organizations (mashups).

Application State vs. Resource State

The stateless constraint of REST doesn’t say that a server can’t maintain any state, it only says that a server shouldn’t maintain application (or session) state.

A server is allowed to maintain other state. In fact, most servers would be completely useless if they didn’t. Think of the Amazon web store without any books! We call this resource state to distinguish it from application state.

So where do we draw the line between resource state and application state?

Resource state is information we want to be available between multiple sessions of the same user, and between sessions of different users. Resource state can initially be supplied by either servers (e.g. books) or clients (e.g. book reviews).

Application state is the information that pertains to one particular session of the application. The contents of my shopping cart could be application state, for instance.

Note that this is not how Amazon implemented it; they keep this state on the server. That doesn’t mean that the people at Amazon don’t understand REST. The web browser that I use to shop isn’t sophisticated enough to maintain the application state. Also, they want me to be able to close my browser and return to my shopping cart tomorrow.

This example shows that what is application and what resource state is a design decision.

Application state pertains to the goal the user is trying to achieve while driving the client. It is this state that we’re referring to when we talk about state diagrams for REST APIs, not the resource state.

State Transfer

Application state is all the information maintained on the client side while the user is trying to accomplish a goal. This information is built up piece by piece based on the resource state that is transferred between client and server.

The resource state is transferred as a representation, a particular serialization of the resource state suitable for inclusion in an HTTP message body.

Serialization is governed by the rules laid out in a media type. There are many different media types, some more mature than others.

Since clients and servers transfer representations of resource state, we speak of Representational State Transfer (REST).

How To Return Error Details From REST APIs

errorThe HTTP protocol uses status codes to return error information. This facility, while extremely useful, is too limited for many use cases. So how do we return more detailed information?

There are basically two approaches we can take:

  1. Use a dedicated media type that contains the error details
  2. Include the error details in the used media type

Dedicated Error Media Types

There are at least two candidates in this space:

  1. Problem Details for HTTP APIs is an IETF draft that we’ve used in some of our APIs to good effect. It treats both problem types and instances as URIs and is extensible. This media type is available in both JSON and XML flavors.
  2. application/vnd.error+json is for JSON media types only. It also feels less complete and mature to me.

A media type dedicated to error reporting has the advantage of being reusable in many REST APIs. Any existing libraries for handling them could save us some effort. We also don’t have to think about how to structure the error details, as someone else has done all that hard work for us.

However, putting error details in a dedicated media type also increases the burden on clients, since they now have to handle an additional media type.

Another disadvantage has to do with the Accept header. It’s highly unlikely that clients will specify the error media type in Accept. In general, we should either return 406 or ignore the Accept header when we can’t honor it. The first option is not acceptable (pun intended), the second is not very elegant.

Including Error Details In Regular Media Type

We could also design our media types such that they allow specifying error details. In this post, I’m going to stick with three mature media types: Mason, Siren, and UBER.

Mason uses the @error property:

{
  "@error": {
    "@id": "b2613385-a3b2-47b7-b336-a85ac405bc66",
    "@message": "There was a problem with one or more input values.",
    "@code": "INVALIDINPUT"
  }
}

The existing error properties are compatible with Problem Details for HTTP APIs, and they can be extended.

Siren doesn’t have a specific structure for errors, but we can easily model errors with the existing structures:

{
  "class": [
    "error"
  ],
  "properties": {
    "id": "b2613385-a3b2-47b7-b336-a85ac405bc66",
    "message": "There was a problem with one or more input values.",
    "code": "INVALIDINPUT"
  }
}

We can even go a step further and use entities to add detailed sub-errors. This would be very useful for validation errors, for instance, where you can report all errors to the client at once. We could also use the existing actions property to include a retry action. And we could use the error properties from Problem Details for HTTP APIs.

UBER has an explicit error structure:

{
  "uber": {
    "version": "1.0",
    "error": {
      "data": [
        { "name": "id", "value": "b2613385-a3b2-47b7-b336-a85ac405bc66" },
        { "name": "message", "value": "There was a problem with one or more input values." },
        { "name": "code", "value": "INVALIDINPUT" }
      ]
    }
  }
}

Again, we could reuse the error properties from Problem Details for HTTP APIs.

Conclusion

My advice would be to use the error structure of your existing media type and use the extensibility features to steal all the good ideas from Problem Details for HTTP APIs.

How To Design a REST API

rest-easyThere is a lot of interest in REST APIs these days. Unfortunately, most APIs I see are not very mature.

In this post I’d like to share my approach to designing REST APIs:

  1. Understand the problem domain and application requirements and document them as a state diagram
  2. Discover the resources from the transitions
  3. Name the resources with URIs
  4. Select one or more media types to serialize the various representations identified in the resource model
  5. Assign link relations to each of the transitions
  6. Add documentation as required

Note that this is a variation of the design process described in RESTful Web Services. Now, let’s look at each of the steps in detail.

Step 1: Understand the problem domain and application requirements and document them as a state diagram

Without understanding the domain, it’s impossible to come up with a good design for anything.

We want to capture the domain in a way that makes sense for REST APIs. Since the purpose in life for a REST API is to be consumed by REST clients, it makes sense to document the application domain from the point of view of a REST client.

A REST client starts at some well-known URI (the billboard URI, or the URI of the home resource) and then follows links until its goal is met:
restClientFlow
In other words, a REST client starts at some initial state, and then transitions to other states by following links (i.e. executing HTTP methods against URIs) that are discovered from the previously returned representation.

A natural way to capture this information is in the form of a state diagram. For bonus points, we can turn the requirements into executable specifications using BDD techniques and derive the state diagram from the BDD scenarios.

For each scenario, we should specify the happy path, any applicable alternative paths and also the sad paths (edge, error, and abuse cases). We can do this iteratively, starting with only the happy path and then adding progressively more detail based on alternative and sad paths.

We can first collect all scenarios and build the entire state diagram from there. Alternatively, we can start with a select few scenarios, work through the design process, and then repeat everything with new scenarios.

In other words, we can do waterfall-like Big Analysis/Design Up Front, or work through feature by feature in a more Agile manner.

Either way, we document the requirements using a state diagram and work from there.

Step 2: Discover the resources from the transitions

You can build up the resource model piece-by-piece:
transitions

  1. Start with the initial state
  2. Create (or re-use) a resource with a representation that corresponds to this state
  3. For each transition starting from the current state, make sure there is a corresponding method in some resource that implements the transition
  4. Repeat for all transitions in each of the remaining states

Step 3: Name the resources with URIs

Every resource should be identified by a URI. From the client’s perspective, this is an implementation detail, but we still need to do this before we can implement the server.

We should follow best practices for URIs, like keeping them cool.

Step 4: Select one or more media types to serialize the various representations identified in the resource model

mediaWhen extending an existing design, you should stick with the already selected media types.

For new APIs, we should choose a mature format, like Siren or Mason.

There could be specific circumstances where these are not good choices. In that case, carefully select an alternative.

Step 5: Assign link relations to each of the transitions

A REST client follows transitions in the state diagram by discovering links in representations. This discovery process is made possible by link relations.

Link relations decouple the client from the URIs that the server uses, giving the server the freedom to change its URI structure at will without breaking any clients. Link relations are therefore an important part of any REST API.

We should try to use existing link relations as much as possible. They don’t cover every case, however, so sometimes you need to invent your own.

Step 6: Add documentation as required

learn moreIn order to help developers build clients that work against your API, you will most likely want to add some documentation that explains certain more subtle points.

Examples are very helpful to illustrate those points.

You may also add instructions for server developers that will implement the API, like what caching to use.

Conclusion

I’ve successfully used this approach on a number of APIs. Next time, I’ll show you with an example how the above process is actually very easy with the right support.

In the meantime, I’d love to hear how you approach REST API design. Please leave a comment below.

How To Control Access To REST APIs

hackerExposing your data or application through a REST API is a wonderful way to reach a wide audience.

The downside of a wide audience, however, is that it’s not just the good guys who come looking.

Securing REST APIs

Security consists of three factors:

  1. Confidentiality
  2. Integrity
  3. Availability

In terms of Microsoft’s STRIDE approach, the security compromises we want to avoid with each of these are Information Disclosure, Tampering, and Denial of Service. The remainder of this post will only focus on Confidentiality and Integrity.

In the context of an HTTP-based API, Information Disclosure is applicable for GET methods and any other methods that return information. Tampering is applicable for PUT, POST, and DELETE.

Threat Modeling REST APIs

A good way to think about security is by looking at all the data flows. That’s why threat modeling usually starts with a Data Flow Diagram (DFD). In the context of a REST API, a close approximation to the DFD is the state diagram. For proper access control, we need to secure all the transitions.

The traditional way to do that, is to specify restrictions at the level of URI and HTTP method. For instance, this is the approach that Spring Security takes. The problem with this approach, however, is that both the method and the URI are implementation choices.

link-relationURIs shouldn’t be known to anybody but the API designer/developer; the client will discover them through link relations.

Even the HTTP methods can be hidden until runtime with mature media types like Mason or Siren. This is great for decoupling the client and server, but now we have to specify our security constraints in terms of implementation details! This means only the developers can specify the access control policy.

That, of course, flies in the face of best security practices, where the access control policy is externalized from the code (so it can be reused across applications) and specified by a security officer rather than a developer. So how do we satisfy both requirements?

Authorizing REST APIs

I think the answer lies in the state diagram underlying the REST API. Remember, we want to authorize all transitions. Yes, a transition in an HTTP-based API is implemented using an HTTP method on a URI. But in REST, we shield the URI using a link relation. The link relation is very closely related to the type of action you want to perform.

The same link relation can be used from different states, so the link relation can’t be the whole answer. We also need the state, which is based on the representation returned by the REST server. This representation usually contains a set of properties and a set of links. We’ve got the links covered with the link relations, but we also need the properties.

PolicyIn XACML terms, the link relation indicates the action to be performed, while the properties correspond to resource attributes.

Add to that the subject attributes obtained through the authentication process, and you have all the ingredients for making an XACML request!

There are two places where such access control checks comes into play. The first is obviously when receiving a request.

You should also check permissions on any links you want to put in the response. The links that the requester is not allowed to follow, should be omitted from the response, so that the client can faithfully present the next choices to the user.

Using XACML For Authorizing REST APIs

I think the above shows that REST and XACML are a natural fit.

All the more reason to check out XACML if you haven’t already, especially XACML’s REST Profile and the forthcoming JSON Profile.

Behavior-Driven RESTful APIs

In the RESTBucks example, the authors present a useful state diagram that describes the actions a client can perform against the service.

Where does such an application state diagram come from? Well, it’s derived from the requirements, of course.

Since I like to specify requirements using examples, let’s see how we can derive an application state diagram from BDD-style requirements.

Example: RESTBucks state diagram

Here are the three scenarios for the Order a Drink story:

Scenario: Order a drink

Given the RESTBucks service
When I create an order for a large, semi milk latte for takeaway
Then the order is created
When I pay the order using credit card xxx1234
Then I receive a receipt
And the order is paid
When I wait until the order is ready
And I take the order
Then the order is completed

Scenario: Change an order

Given the RESTBucks service
When I create an order for a large, semi milk latte for takeaway
Then the order is created
And the size is large
When I change the order to a small size
Then the order is created
And the size is small

Scenario: Cancel an order

Given the RESTBucks service
When I create an order for a large, semi milk latte for takeaway
Then the order is created
When I cancel the order
Then the order is canceled

Let’s look at this in more detail, starting with the happy path scenario.

Given the RESTBucks service
When I create an order for a large, semi milk latte for takeaway

The first line tells me there is a REST service, at some given billboard URL. The second line tells me I can use the POST method on that URI to create an Order resource with the given properties.
bdd-rest-1

Then the order is created

This tells me the POST returns 201 with the location of the created Order resource.

When I pay the order using credit card xxx1234

This tells me there is a pay action (link relation).
bdd-rest-2

Then I receive a receipt

This tells me the response of the pay action contains the representation of a Receipt resource.
bdd-rest-3

And the order is paid

This tells me there is a link from the Receipt resource back to the Order resource. It also tells me the Order is now in paid status.
bdd-rest-4

When I wait until the order is ready

This tells me that I can refresh the Order using GET until some other process changes its state to ready.
bdd-rest-5

And I take the order

This tells me there is a take action (link relation).
bdd-rest-6

Then the order is completed

This tells me that the Order is now in completed state.
bdd-rest-7

Analyzing the other two scenarios in similar fashion gives us a state diagram that is very similar to the original in the RESTBucks example.
bdd-rest-8
The only difference is that this diagram here contains an additional action to navigate from the Receipt to the Order. This navigation is also described in the book, but not shown in the diagram in the book.

Using BDD techniques for developing RESTful APIs

Using BDD scenarios it’s quite easy to discover the application state diagram. This shouldn’t come as a surprise, since the Given/When/Then syntax of BDD scenarios is just another way of describing states and state transitions.

From the application state diagram it’s only a small step to the complete resource model. When the resource model is implemented, you can re-use the BDD scenarios to automatically verify that the implementation matches the requirements.

So all in all, BDD techniques can help us a lot when developing RESTful APIs.

HyperRosetta

rosetta-stoneThe Rosetta stone is a rock with the same text inscribed in three different languages. This allowed us to decipher Egyptian hieroglyphs.

In this post I’ll introduce a similar “stone” for hypermedia formats, using the RESTBucks example.

I think that seeing concrete hypermedia messages in different formats will make the similarities and differences clearly visible, which will hopefully make it easier to choose the right format for your API.

The text that I’ll use for HyperRosetta, the hypermedia Rosetta stone, is Example 5.6 of the REST in Practice book. For those hypermedia formats that can include templates, I’ll use the payment link with the representation in Example 5.7 of the book.

These are the media types available on HyperRosetta:

If you’d like to see another media type added to this list, please add a comment to this post.

application/vnd.restbucks+xml

This is the representation used in the book:

<order xmlns="http://schemas.restbucks.com/" 
    xmlns:dap="http://schemas.restbucks.com/dap">
  <dap:link mediaType="application/vnd.restbucks.com"
      uri="http://restbucks.com/order/1234"
      rel"http://relations.restbucks.com/cancel"/>
  <dap:link mediaType="application/vnd.restbucks.com"
      uri="http://restbucks.com/payment/1234"
      rel"http://relations.restbucks.com/payment"/>
  <dap:link mediaType="application/vnd.restbucks.com"
      uri="http://restbucks.com/order/1234"
      rel"http://relations.restbucks.com/update"/>
  <dap:link mediaType="application/vnd.restbucks.com"
      uri="http://restbucks.com/order/1234"
      rel"self"/>
  <item>
    <milk>semi</milk>
    <size>large</size>
    <drink>cappuccino</drink>
  </item>
  <location>takeAway</location>
  <cost>2.0</cost>
  <status>unpaid</status
</order>

This is an example of a domain-specific media type, so I will not be able to re-use existing client libraries to parse it. Since this format is based on XML, I can use an existing XML parser, but it won’t be able to recognize links in this representation.

This media type doesn’t tell me in the message what HTTP methods to use to dereference the link URIs. So the client has to be programmed with the knowledge that it has to use PUT on the update link, for instance. If the service should ever change that to PATCH, the client will break.

The message does tell me the media type to expect in the response, making it easier for the client to decide whether it should follow a link.

The RESTBucks media type is a fiat standard that isn’t registered with IANA.

application/atom+xml

<entry xmlns="http://www.w3.org/2005/Atom">
  <author>
    <name>RESTBucks</name>
  </author>
  <content type="application/vnd.restbucks+xml">
    <order xmlns="http://schemas.restbucks.com/">
      <item>
        <milk>semi</milk>
        <size>large</size>
        <drink>cappuccino</drink>
      </item>
      <location>takeAway</location>
      <cost>2.0</cost>
      <status>unpaid</status
    </order>
  </content>
  <link type="application/vnd.restbucks.com"
      href="http://restbucks.com/payment/1234"
      rel"http://relations.restbucks.com/payment"/>
  <link type="application/vnd.restbucks.com"
      href="http://restbucks.com/order/1234"
      rel"edit"/>
  <link type="application/vnd.restbucks.com"
      href="http://restbucks.com/order/1234"
      rel"self"/>
</entry>

Atom is a domain-general media type that implements the collection pattern. Since it is not specific to a single service, I can use an Atom library to parse the message and it will be able to find links. It will also know that the edit link will allow me to update and delete the order (using PUT and DELETE respectively).

The Atom library won’t help me with parsing the content, but I can still use an XML library for that. As with the domain-specific media type, I will need to embed knowledge of the HTTP method used for paying into my client.

A disadvantage of this particular domain-general media type is the overhead of things like author that don’t particularly make sense for RESTBucks.

application/xml

<order xmlns="http://schemas.restbucks.com/">
  <link href="http://restbucks.com/order/1234"
      rel"http://relations.restbucks.com/cancel"/>
  <link href="http://restbucks.com/payment/1234"
      rel"http://relations.restbucks.com/payment"/>
  <link href="http://restbucks.com/order/1234"
      rel"http://relations.restbucks.com/update"/>
  <link href="http://restbucks.com/order/1234"
      rel"self"/>
  <item>
    <milk>semi</milk>
    <size>large</size>
    <drink>cappuccino</drink>
  </item>
  <location>takeAway</location>
  <cost>2.0</cost>
  <status>unpaid</status
</order>

XML is a domain-agnostic format, which means it can be used for anything. The downside is that you won’t know anything about the content until you look at it.

Formally, this means I can’t dispatch on the media type alone anymore, but need to look at the namespace inside the message. In practice people dispatch on the media type anyway. This can lead to parsing problems when the expectation isn’t met, e.g. when an error document is sent instead of an order.

Worse, XML isn’t even a hypermedia type, since it doesn’t describe links. There is the XLInk standard for that, but I haven’t seen that used in APIs.

application/json

{ "order": {
  "_links": {
    "http://relations.restbucks.com/cancel": { 
      "href": "http://restbucks.com/order/1234"
    },
    "http://relations.restbucks.com/payment": {
      "href": "http://restbucks.com/payment/1234"
    }
    "http://relations.restbucks.com/update": { 
      "href": "http://restbucks.com/order/1234"
    },
    "self": { 
      "href": "http://restbucks.com/order/1234"
    },
    "item": {
      "milk": "semi",
      "size": "large",
      "drink": "cappuccino"
    }
    "location": "takeAway",
    "cost": 2.0,
    "status": "unpaid"
  }
}

JSON is a another domain-agnostic format without linking capabilities, so the same caveats apply as for XML. Despite these significant drawbacks, this is what high-profile projects like Spring use.

application/hal+json

{ "_links": {
    "self": { 
      "href": "http://restbucks.com/order/1234"
    },
    "curies": [{
      "name": "relations",
      "href": "http://relations.restbucks.com/"
    }],
    "relations:cancel": { 
      "href": "http://restbucks.com/order/1234"
    },
    "relations:payment": {
      "href": "http://restbucks.com/payment/1234"
    }
    "relations:update": { 
      "href": "http://restbucks.com/order/1234"
    }
  },
  "_embedded": {
    "item": [{
      "milk": "semi",
      "size": "large",
      "drink": "cappuccino"
    }]
  },
  "location": "takeAway",
  "cost": 2.0,
  "status": "unpaid"
}

HAL is a another domain-agnostic format, so the same caveats apply as for JSON.

However, HAL is a real hypermedia type, since it standardizes how to find links (using the _links property). In addition, it also defines how to find embedded objects (using the _embedded property).

HAL uses curies to make the representation a bit more compact. This is nice for humans, but another thing to program into your clients.

HAL is currently an Internet-Draft. There is also an XML version of HAL registered.

application/collection+json

{  "collection": {
    "version" : "1.0",
    "href" : "http://restbucks.com/orders",
    "items" : [{
      "href": "http://restbucks.com/order/1234",
      "data": [
        { "name": "item1.milk", "value": "semi" }, 
        { "name": "item1.size", "value": "large" }, 
        { "name": "item1.drink", "value": "cappuccino" }, 
        { "name": "location", "value": "takeAway" }, 
        { "name": "cost", "value": 2.0 }, 
        { "name": "status", "value": "unpaid" }
      ],
      "links" : [{ 
        "rel" : "http://relations.restbucks.com/cancel", 
        "href" : "http://restbucks.com/order/1234"
      }, { 
        "rel" : "http://relations.restbucks.com/payment", 
        "href" : "http://restbucks.com/payment/1234"
      }, { 
        "rel" : "http://relations.restbucks.com/update", 
        "href" : "http://restbucks.com/order/1234"
      }, { 
        "rel" : "self", 
        "href" : "http://restbucks.com/order/1234"
      }]
    }]
  }
}

Collection+JSON (Cj) is the translation of Atom into JSON. Like Atom, it’s a domain-general format for collections.

In Cj, properties are single values only, so I had to cheat and use the item1. prefix to keep the item data in the message. A better solution would probably be to extract the order items into its own collection, but I didn’t want to change the granularity of the messages.

Cj provides a template property that specifies how to add an item to the collection or update an existing one. Unfortunately, that mechanism is specific to collection actions, so we can’t use it to specify how to add a payment, for example. I don’t show the template in this example because it suffers from the same problem as the data property in that it can’t embed objects.

application/vnd.mason+json

{ "@namespaces": {
    "relations": {
      "name": "http://relations.restbucks.com/"
    }
  },
  "item": [{
    "milk": "semi",
    "size": "large",
    "drink": "cappuccino"
  }],
  "location": "takeAway",
  "cost": 2.0,
  "status": "unpaid",
  "@links": {
    "self": { 
      "href": "http://restbucks.com/order/1234"
    }
  },
  "@actions": {
    "relations:cancel": { 
      "href": "http://restbucks.com/order/1234",
      "type": "void",
      "method": "DELETE"
    },
    "relations:payment": {
      "href": "http://restbucks.com/payment/1234",
      "title": "Pay the order",
      "type": "any",
      "method": "PUT",
      "template": {
        "payment": {
          "amount": 2.0,
          "cardholderName": "",
          "cardNumber": "",
          "expiryMonth": "",
          "expiryYear": ""
        }
      }
    },
    "relations:update": { 
      "href": "http://restbucks.com/order/1234",
      "type": "any",
      "method": "PUT",
      "template": {
        "item": [{
          "milk": "semi",
          "size": "large",
          "drink": "cappuccino"
        }],
        "location": "takeAway",
        "cost": 2.0,
        "status": "unpaid"
      }
    }
  }
}

Mason is a superset of HAL. It adds actions and errors (not shown in this example), which turns it into a full hypermedia type (level 3b). Mason also supports curies.

A peculiarity of the actions is the type property, which can be void, json, json-files, or any. I’d expected a media type there.

application/vnd.siren+json

{ "class": [ "order" ],
  "properties": [{
    "location": "takeAway",
    "cost": 2.0,
    "status": "unpaid"
  },
  "entities": [{
    "class": [ "item" ],
    "rel": [ "http://relations.restbucks.com/item" ],
    "properties": {
      "milk": "semi",
      "size": "large",
      "drink": "cappuccino"
    }
  }],
  "actions": [{
    "name": "http://relations.restbucks.com/cancel",
    "title": "Cancel the order",
    "method": "DELETE",
    "href": "http://restbucks.com/order/1234"
  }, {
    "name": "http://relations.restbucks.com/payment",
    "title": "Pay the order",
    "href": "http://restbucks.com/payment/1234",
    "type": "application/vnd.siren+json",
    "method": "PUT",
    "fields": [{
      "name": "amount",
      "type": "number",
      "value": 2.0
    }, {
      "name": "cardholderName",
      "type": "text"
    }, {
      "name": "cardNumber",
      "type": "text"
    }, {
      "name": "expiryMonth",
      "type": "number"
    }, {
      "name": "expiryYear",
      "type": "number"
    }],
  }, {
    "name": "http://relations.restbucks.com/update",
    "href": "http://restbucks.com/order/1234",
    "method": "PUT",
    "type": "application/vnd.siren.json"
  }],
  "links": [{ 
    "rel": "self",
    "href": "http://restbucks.com/order/1234"
  }]
}

Siren is a full hypermedia type that includes the HTTP methods and media types to use. It also allows specifying the application semantics using the class, rel, and name properties. This makes it suitable for level 4 APIs.

Siren specifies the type for fields, so that clients can render appropriate UIs.

application/vnd.uber+json

{ "uber": {
  "version": "1.0",
  "data": [{
    "rel": [ "self" ],
    "url": "http://restbucks.com/order/1234"
  }, {
    "id": "order",
    "data": [{
      "name": "item",
      "data": [{
        "name": "milk",
        "value": "semi"
      }, {
        "name": "size", 
        "value": "large",
      }, {
        "name": "drink",
        "value": "cappuccino"
      }]
    }, {
      "name": "location",
      "value": "takeAway"
    }, {
      "name": "cost",
      "value": 2.0
    }, {
      "name": "status",
      "value": "unpaid"
    }]
  }, {
    "rel": "http://relations.restbucks.com/cancel",
    "url": "http://restbucks.com/payment/1234",
    "action": "remove"
  }, {
    "rel": "http://relations.restbucks.com/payment",
    "url": "http://restbucks.com/payment/1234",
    "sending": "application/vnd.uber+json",
    "action": "replace",
    "data": [{
      "name": "amount",
      "value": 2.0
    }, {
      "name": "cardholderName",
    }, {
      "name": "cardNumber",
    }, {
      "name": "expiryMonth",
    }, {
      "name": "expiryYear",
    }],
  }, {
    "rel": "http://relations.restbucks.com/update",
    "url": "http://restbucks.com/order/1234",
    "action": "replace",
    "sending": "application/vnd.uber.json"
  }]
}

UBER is a full hypermedia format with a lean message structure. It combines in the data property what other media types separate out in e.g. links, actions, and properties, which makes it a bit hard to read for humans.

UBER allows specifying application semantics using the name and rel properties, making it a level 4 capable media type.

There is also an XML variant of UBER.

RESTBucks Evolved

restbucksThe book REST in Practice: Hypermedia and Systems Architecture uses an imaginary StarBucks-like company as its running example.

I think this is a great example, since most people are familiar with the domain.

The design is also simple enough to follow, yet complex enough to be interesting.

Problem Domain

RESTbucks is about ordering and paying for coffee (or tea) and food. Here is the state diagram for the client:
restbucks-states

  1. Create the order
  2. Update the order
  3. Cancel the order
  4. Pay for the order
  5. Wait for the order to be prepared
  6. Take the order

Book Design

The hypermedia design in the book for the service is as follows:

  1. The client POSTs an order to the well-known RESTBucks URI. This returns the order URI in the Location header. The client then GETs the order URI
  2. The client POSTs an updated order to the order URI
  3. The client DELETEs the order URI
  4. The client PUTs a payment to the URI found by looking up a link with relation http://relations.restbucks.com/payment
  5. The client GETs the order URI until the state changes
  6. The client DELETEs the URI found by looking up a link with relation http://relations.restbucks.com/receipt

The book uses the specialized media type application/vnd.restbucks.order+xml for all messages exchanged.

Design Problems

Here are some of the problems that I have with the above approach:

  1. I think the well-known URI for the service (what Mike Amundsen calls the billboard URI) should respond to a GET, so that clients can safely explore it.
    This adds an extra message, but it also makes it possible to expand the service with additional functionality. For instance, when menus are added in a later chapter of the book, a second well-known URI is introduced. With a proper home document-like resource in front of the order service, this could have been limited to a new link relation.
  2. I’d rather use PUT for updating an order, since that is an idempotent method. The book states that the representation returned by GET contains links and argues that this implies that (1) PUT messages should also contain those links and (2) that that would be strange since those links are under control of the server.
    I disagree with both statements. A server doesn’t necessarily have to make the formats for GET and PUT exactly the same. Even if it did, some parts, like the links, could be optional. Furthermore, there is no reason the server couldn’t accept and ignore the links.
  3. The DELETE is fine.
    An alternative is to use PUT with a status of canceled, since we already have a status property anyway. That opens up some more possibilities, like re-instating a canceled order, but also introduces issues like garbage collection.
  4. I don’t think PUT is the correct method. Can the service really guarantee under all circumstances that my credit card won’t get charged twice if I repeat the payment?
    More importantly, this design assumes that payments are always for the whole order. That may seem logical at first, but once the book introduces vouchers that logic evaporates. If I have a voucher for a free coffee, then I still have to pay for anything to eat or for a second coffee.
    I’d create a collection of payments that the client should POST to. I’d also use the standard payment link relation defined in RFC 5988.
  5. This is fine.
  6. This makes no sense to me: taking the order is not the same as deleting the receipt. I need the receipt when I’m on a business trip, so I can get reimbursed!
    I’d rather PUT a new order with status taken.

Service Evolution

Suppose you’ve implemented your own service based on the design in the book.

evolutionFurther suppose that after reading the above, you want to change your service.

How can you do that without breaking any clients that may be out there?

After all, isn’t that what proponents tout as one of the advantages of a RESTful approach?

Well, yes and no. The media type defined in the book is at level 3a, and so will allow you to change URIs. However, the use of HTTP methods is defined out-of-band and you can’t easily change that.

Now imagine that the book would have used a full hypermedia type (level 3b) instead. In that case, the HTTP method used would be part of the message. The client would have to discover it, and the server could easily change it without fear of breaking clients.

Of course, this comes at the cost of having to build more logic into the client. That’s why I think it’s a good idea to use an existing full hypermedia type like Mason, Siren, or UBER. Such generic media types are much more likely to come with libraries that will handle this sort of work for the client.

REST Maturity

rest-maturity2In 2008, Leonard Richardson published his Maturity Heuristic that classified web services into three levels based on their use of URI, HTTP, and hypermedia.

Back then, most web services were stuck at either level 1 or 2. Unfortunately, not a whole lot has improved since then in that respect: so-called pragmatic REST is still the norm.

BTW, I really dislike the term “pragmatic REST”. It’s a cheap rhetoric trick to put opponents (“dogmatists”) on the defensive.

More importantly, it creates semantic diffusion: pragmatic REST is not actually REST according to the definition, so please don’t call it that way or else we’re going to have a hard time understanding each other. The term REST hardly means anything anymore these days.

Anyway, there is some light at the end of the tunnel: more services are now at level 3, where they serve hypermedia. A good example by a big name is Amazon’s AppStream API.

The difference between plain media types, like image/jpeg, and hypermedia types, like text/html, is of course the “hyper” part. Links allow a client to discover functionality without being coupled to the server’s URI structure.

JSONBTW, application/json is not a hypermedia type, since JSON doesn’t define links.

We can, of course, use a convention on top of JSON, for instance that there should be a links property with a certain structure to describes the links, like Spring HATEOAS does.

The problem with conventions is that they are out-of-band communication, and a client has no way of knowing for sure whether that convention is followed when it sees a Content-Type of application/json. It’s therefore much better to use a media type that turns the convention into a rule, like HAL does.

Speaking of out-of-band communication, the amount of it steadily decreases as we move up the levels. This is a very good thing, as it reduces the amount of coupling between clients and servers.

Level 3 isn’t really the end station, however. Even with a hypermedia format like HAL there is still a lot of out-of-band communication.

HALHAL doesn’t tell you which HTTP method to use on a particular link, for instance.

The client can only know because a human has programmed it with that knowledge, based on some human-readable description that was published somewhere.

Imagine that the human Web would work this way. We wouldn’t be able to use the same browser to shop at Amazon and read up at Wikipedia and do all those other things we take for granted. Instead, we would need an Amazon Browser, a Wikipedia Browser, etc. This is what we do with APIs today!

Moving further into the direction of less out-of-band communication requires more than just links. Links only specify the URI part and we also need the HTTP and media type parts inside our representations. We might call this level 3b, Full Hypermedia.

Siren gives you this. Uber even goes a step further and also abstracts the protocol, so that you can use it with, say, CoAP rather than HTTP.

These newer hypermedia types allow for the use of a generic client that can handle any REST API that serves that hypermedia type, just like a web browser can be used against anything that serves HTML. An example of such an effort is the HAL browser (even though HAL is stuck at level 3a).

However, even with the inclusion of protocol, media type, and method in the representation, we still need some out-of-band communication.

The HAL browser can navigate any API that serves HAL, but it doesn’t understand the responses it gets. Therefore it can’t navigate links on its own to reach a certain goal. For true machine-to-machine (M2M) communication, we still need more.

ALPSIf we ever get the whole semantic web sorted out, this might one day be the final answer, but I’m not holding my breath.

In the meantime we’ll have to settle for partial answers.

One piece of the puzzle could be to define application semantics using profiles, for instance in the ALPS format. We might call this level 4, Semantic Profile.

We’d still need a human to read out-of-band communication and build a special-purpose client for M2M scenarios. But this client could handle all services in the application domain it is programmed to understand, not just one.

Also, the human could be helped a lot by a generic API browser that fetches ALPS profiles to explain the API.

All this is currently far from a reality. But we can all work towards this vision by choosing generic, full-featured hypermedia types like Siren or Uber for our APIs and by documenting our application semantics using profiles in ALPS.

If you need more convincing then please read RESTful Web APIs, which Leonard Richardson co-wrote with Uber and ALPS creator Mike Amundsen. This is easily the best book on REST on the market today.

The Decorator Pattern

decoratingOne design pattern that I don’t see being used very often is Decorator.

I’m not sure why this pattern isn’t more popular, as it’s quite handy.

The Decorator pattern allows one to add functionality to an object in a controlled manner. This works at runtime, even with statically typed languages!

The decorator pattern is an alternative to subclassing. Subclassing adds behavior at compile time, and the change affects all instances of the original class; decorating can provide new behavior at run-time for individual objects.

The Decorator pattern is a good tool for adhering to the open/closed principle.

Some examples may show the value of this pattern.

Example 1: HTTP Authentication

Imagine an HTTP client, for example one that talks to a RESTful service.

Some parts of the service are publicly accessible, but some require the user to log in. The RESTful service responds with a 401 Unauthorized status code when the client tries to access a protected resource.

Changing the client to handle the 401 leads to duplication, since every call could potentially require authentication. So we should extract the authentication code into one place. Where would that place be, though?

Here’s where the Decorator pattern comes in:

public class AuthenticatingHttpClient
    implements HttpClient {

  private final HttpClient wrapped;

  public AuthenticatingHttpClient(HttpClient wrapped) {
    this.wrapped = wrapped;
  }

  @Override
  public Response execute(Request request) {
    Response response = wrapped.execute(request);
    if (response.getStatusCode() == 401) {
      authenticate();
      response = wrapped.execute(request);
    }
    return response;
  }

  protected void authenticate() {
    // ...
  }

}

A REST client now never has to worry about authentication, since the AuthenticatingHttpClient handles that.

Example 2: Caching Authorization Decisions

OK, so the user has logged in, and the REST server knows her identity. It may decide to allow access to a certain resource to one person, but not to another.

IOW, it may implement authorization, perhaps using XACML. In that case, a Policy Decision Point (PDP) is responsible for deciding on access requests.

Checking permissions it often expensive, especially when the permissions become more fine-grained and the access policies more complex. Since access policies usually don’t change very often, this is a perfect candidate for caching.

This is another instance where the Decorator pattern may come in handy:

public class CachingPdp implements Pdp {

  private final Pdp wrapped;

  public CachingPdp(Pdp wrapped) {
    this.wrapped = wrapped;
  }

  @Override
  public ResponseContext decide(
      RequestContext request) {
    ResponseContext response = getCached(request);
    if (response == null) {
      response = wrapped.decide(request);
      cache(request, response);
    }
    return response;
  }

  protected ResponseContext getCached(
      RequestContext request) {
    // ...
  }

  protected void cache(RequestContext request, 
      ResponseContext response) {
    // ...
  }

}

As you can see, the code is very similar to the first example, which is why we call this a pattern.

As you may have guessed from these two examples, the Decorator pattern is really useful for implementing cross-cutting concerns, like the security features of authentication, authorization, and auditing, but that’s certainly not the only place where it shines.

If you look carefully, I’m sure you’ll be able to spot many more opportunities for putting this pattern to work.