Today I'd like to present the draft of the OHM format, a new media type for Hypermedia/REST level 3 applications that relies on OpenAPI to describe the Hypermedia controls.
Description of the OHM format
TL;DR: the format is called OHM, for OpenAPI-HyperMedia. Its media type is application/ohm+json and its definition is:
{
  "content": <JSON representation of the resource>,
  "controls": <An OpenAPI Specification (OAS) in JSON format containing the possible operations from this state>
}
... and that's it. Have you ever seen a format definition as small as this one?
For instance, an OHM message for a Person resource that has a collection of Address resources could look like this:
{
  "content": {
    "firstName": "Jane",
    "lastName": "Doe"
  },
  "controls": {
    "openapi": "3.0.1",
    "info": {
      "title": "rest",
      "version": "0.0.1"
    },
    "paths": {
      "/api/addresses/1": {
        "get": {
          "summary": "Jane Doe's address"
        }
      }
    }
  }
}
The content field contains the information describing a Person called Jane Doe, exactly as it would appear with the usual application/json media type. The controls field is an OAS document with a link to Jane Doe's address. An OHM client can then navigate from the Person to her address without prior knowledge of the path, method, arguments, etc. that must be used.
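To make the client side more concrete, here is a minimal Python sketch of how a generic client might discover the operations offered by a message. The function name `list_operations` is hypothetical and not part of the OHM specification; the sketch only assumes that `controls` follows the OpenAPI 3.0 layout shown above.

```python
# Operation keys allowed in an OpenAPI 3.0 Path Item Object.
HTTP_METHODS = ("get", "put", "post", "delete", "options", "head", "patch", "trace")


def list_operations(ohm_doc):
    """Return (METHOD, path, summary) for every operation found in `controls`.

    Hypothetical helper for illustration only.
    """
    operations = []
    paths = ohm_doc.get("controls", {}).get("paths", {})
    for path, path_item in paths.items():
        for method, operation in path_item.items():
            # Skip path item keys that are not operations (e.g. "parameters").
            if method in HTTP_METHODS:
                operations.append((method.upper(), path, operation.get("summary", "")))
    return operations


person = {
    "content": {"firstName": "Jane", "lastName": "Doe"},
    "controls": {
        "openapi": "3.0.1",
        "info": {"title": "rest", "version": "0.0.1"},
        "paths": {"/api/addresses/1": {"get": {"summary": "Jane Doe's address"}}},
    },
}

for method, path, summary in list_operations(person):
    print(f"{method} {path}: {summary}")
```

The client never hard-codes /api/addresses/1: it only learns the path and method from the message it just received.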
It's also possible to use OHM to describe write operations:
GET /api/orders/106
{
  "content": {
    "id": 106,
    "product": "Licensed Concrete Keyboard",
    "cost": 571.0,
    "customer": {
      "id": 1,
      "name": "Ricky Swift"
    }
  },
  "controls": {
    "openapi": "3.0.1",
    "paths": {
      "/api/orders/106": {
        "put": {
          "summary": "Update order 106",
          "requestBody": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/Order"
                }
              }
            }
          }
        },
        "delete": {
          "summary": "Delete order 106"
        }
      },
      "/api/customers/1": {
        "get": {
          "summary": "Get this order's customer (id=1)"
        }
      }
    },
    "components": {
      "schemas": {
        "Order": {
          "type": "object",
          "properties": {
            "cost": {
              "type": "number",
              "example": 571.0
            },
            "customer": {
              "type": "object",
              "properties": {
                "id": {
                  "type": "integer",
                  "example": 1
                }
              }
            }
          }
        }
      }
    }
  }
}
This is a classic CRUD resource: a PUT operation updates the resource and a DELETE operation removes it. Note that OpenAPI fully describes the expected request body of the PUT operation and its media type.
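As an illustration of how a client could use that description, the following Python sketch looks up the schema of a request body, following a local `$ref` when there is one. The helper names `resolve_ref` and `request_body_schema` are hypothetical, and the sketch only handles references inside the same document; the `controls` value is a trimmed copy of the order example.

```python
def resolve_ref(controls, ref):
    """Follow a local JSON reference such as '#/components/schemas/Order'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are handled in this sketch: {ref}")
    node = controls
    for part in ref[2:].split("/"):
        # Undo JSON Pointer escaping ('~1' means '/', '~0' means '~').
        node = node[part.replace("~1", "/").replace("~0", "~")]
    return node


def request_body_schema(controls, path, method, media_type="application/json"):
    """Return the schema expected for a request body, or None if there is no body."""
    operation = controls["paths"][path][method]
    body = operation.get("requestBody")
    if body is None:
        return None
    schema = body["content"][media_type]["schema"]
    return resolve_ref(controls, schema["$ref"]) if "$ref" in schema else schema


# Trimmed copy of the "controls" field from the order example.
controls = {
    "openapi": "3.0.1",
    "paths": {
        "/api/orders/106": {
            "put": {
                "summary": "Update order 106",
                "requestBody": {
                    "content": {
                        "application/json": {
                            "schema": {"$ref": "#/components/schemas/Order"}
                        }
                    }
                },
            },
            "delete": {"summary": "Delete order 106"},
        }
    },
    "components": {
        "schemas": {
            "Order": {
                "type": "object",
                "properties": {
                    "cost": {"type": "number", "example": 571.0},
                    "customer": {
                        "type": "object",
                        "properties": {"id": {"type": "integer", "example": 1}},
                    },
                },
            }
        }
    },
}

schema = request_body_schema(controls, "/api/orders/106", "put")
print(sorted(schema["properties"]))  # ['cost', 'customer']
print(request_body_schema(controls, "/api/orders/106", "delete"))  # None
```

A generic client could use the returned schema to render an edit form or to validate a payload before sending the PUT.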
The rest of this post will explain how to use this format to build better APIs truly embracing REST concepts with HATEOAS.
OHM is a media type for REST
First, what is a true REST application (also called REST level 3)? If you already know, you can skip this whole chapter. But I talk with a lot of people who say they do REST APIs but don't realize that what they do is actually good old RPC (Remote Procedure Call) over HTTP.
Quoting Roy Fielding, the inventor of REST:
"What needs to be done to make the REST architectural style clear on the notion that hypertext is a constraint? In other words, if the engine of application state (and hence the API) is not being driven by hypertext, then it cannot be RESTful and cannot be a REST API."
A key constraint of REST is thus to use Hypermedia As The Engine Of Application State (HATEOAS). It was at the core of the REST dissertation, it is even in the name "REpresentational State Transfer", and it is what makes this architectural style so unique. HATEOAS means that messages received from the server contain all the information needed to drive the application. That's exactly how navigation works with HTML documents: you go from one state to the other using links and forms. So you use REST every day when you go on the Web and navigate HTML documents with your browser. But never when you call a Web service through an HTTP+JSON API...
OHM provides HATEOAS support through the controls field. Compared with HTML, it is the equivalent of anchor links (GET) and forms (POST), but since it relies on OpenAPI, it can use all the HTTP methods. The content field is the equivalent of all the divs and CSS that contain the representation of a resource. In a sense, OHM is a format oriented towards machines (easy to parse) whereas HTML is oriented towards humans (with graphical information).
When to use a REST approach with the OHM format ?
At the time of Roy Fielding's dissertation, the state of Web applications resided entirely on the server.
But with the advent of JavaScript in the browser and the mobile revolution, the state machine of the application has now moved inside the various clients and there's no need anymore to transfer the state transitions from the server. So we just expose Web services from the server and use them with RPC HTTP calls that we describe with OpenAPI.
With a REST format such as OHM, you can put the state management of the application back into the server. This gives the possibility to share parts or the whole application logic between the different types of client (browser, mobile, ...). And you can evolve the application logic without having to update the client which is a major gain when working with native mobile apps.
The advantage compared to HTML is that the view is still built by the client and so it can be used with non-HTML clients such as native mobile ones.
There's also the case where we don't expose an application but a Web service. In that case the application stays in a client that we don't control, and REST may have only a small advantage. It provides a browsable API for consumers so they can better understand the natural chaining of API calls, but you could see that as too small a benefit compared to the additional work needed to add Hypermedia links.
The exception is when you use the same endpoints for both your applications and your services. In that case OHM provides HATEOAS for the application, and your service clients still get the content in the content field, identical to what they would get with the application/json media type.
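One way to serve both kinds of clients from the same endpoint is content negotiation on the Accept header. The Python sketch below is framework-agnostic and makes simplifying assumptions: the function name `render` is hypothetical, and q-values and wildcards in the Accept header are ignored.

```python
import json

OHM = "application/ohm+json"
PLAIN_JSON = "application/json"


def render(accept_header, content, controls):
    """Choose a representation from the Accept header.

    Returns a (media_type, body) tuple. Simplification: q-values and
    wildcards are ignored; any mention of the OHM media type selects OHM.
    """
    accepted = [item.split(";")[0].strip().lower() for item in accept_header.split(",")]
    if OHM in accepted:
        return OHM, json.dumps({"content": content, "controls": controls})
    # Service clients get exactly the representation they had before.
    return PLAIN_JSON, json.dumps(content)


content = {"id": 1, "name": "Ricky Swift"}
controls = {
    "openapi": "3.0.1",
    "info": {"title": "rest", "version": "0.0.1"},
    "paths": {},
}

print(render("application/json", content, controls))
print(render("application/ohm+json", content, controls)[0])
```

With this arrangement, the JSON body sent to service clients is the same object that OHM clients find under content.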
Another advantage of using REST and OHM for Web services is that you can provide an endpoint that lists all the operations available to clients, a bit as if you provided a single HTML page with all the links and forms that exist on your Web site. With semantics shared between the client and the API (for instance using the OpenAPI operationId field), you can create clients that build the URLs and requests dynamically and adapt even if you change the URLs or the parameter layout on the server. Of course, if non-REST clients use the API, you will still need to version the API and ensure backward compatibility for these clients.
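Here is a minimal Python sketch of that idea: the client only knows an operationId and looks up the method and path at runtime. The helper `find_operation`, the operationId `getOrderCustomer`, and the second set of URLs are all made up for the example.

```python
def find_operation(controls, operation_id):
    """Return (METHOD, path) for the operation with this operationId, or None."""
    for path, path_item in controls.get("paths", {}).items():
        for method, operation in path_item.items():
            # Path items may also hold non-operation keys (lists, strings).
            if isinstance(operation, dict) and operation.get("operationId") == operation_id:
                return method.upper(), path
    return None


# Two hypothetical server versions exposing the same operation under different URLs.
controls_v1 = {
    "openapi": "3.0.1",
    "info": {"title": "rest", "version": "0.0.1"},
    "paths": {"/api/customers/1": {"get": {"operationId": "getOrderCustomer"}}},
}
controls_v2 = {
    "openapi": "3.0.1",
    "info": {"title": "rest", "version": "0.0.2"},
    "paths": {"/api/v2/clients/1": {"get": {"operationId": "getOrderCustomer"}}},
}

print(find_operation(controls_v1, "getOrderCustomer"))  # ('GET', '/api/customers/1')
print(find_operation(controls_v2, "getOrderCustomer"))  # ('GET', '/api/v2/clients/1')
```

The client code is identical in both cases; only the shared vocabulary (here, the operationId) has to stay stable.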
It's not the primary intent of OHM to provide full automation and interoperability between systems, although this can certainly be built by adding vocabularies on top with things like JSON-LD. I'd be very interested to get opinions and ideas on this.
The power of the OHM browser
One big advantage of a REST format such as OHM is that you can build generic clients that understand any application that supports the media type. For instance, for OHM you can use the OHM Browser, a modified swagger-ui that navigates the links when you execute an operation instead of displaying the HTTP call response. (The sample app it points to by default is hosted on the Heroku free tier, so it can take a little time to wake up. Please be patient.)
You can test it with a sample OHM application built with JHipster that represents customers that have orders that you can create, read, update, delete with navigation links and that also supports pagination. The source code for this application is located here.
How does OHM compare to other REST formats?
A lot of HATEOAS formats only support navigation links (GET) and not actions that modify the remote system (POST, PUT, ...). As it relies on OpenAPI, OHM supports all the common HTTP methods. It also fully describes the operation links, their call parameters and can embed documentation on how to use them in the messages.
OHM supports all the H Factors described by Mike Amundsen.
You don't need to learn a new format if you already know OpenAPI.
There are a lot of open-source tools in the OpenAPI landscape that can be reused to support OHM. For instance the OHM-browser is based on swagger-ui, the sample app uses the OAS generated by Springfox, ...
It's relatively easy to transition an HTTP+JSON API to OHM by putting the former JSON response in the content field and adding the links progressively.
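A minimal Python sketch of such a progressive migration could look like this. The function name `to_ohm` and the default title and version are assumptions for the example; an empty paths object is allowed by OpenAPI 3.0, so the first step can ship without any link.

```python
def to_ohm(legacy_json, paths=None):
    """Wrap an existing JSON response in an OHM envelope.

    `paths` holds the OpenAPI path items to advertise; it can start empty
    and be filled in as links are added. Hypothetical helper.
    """
    return {
        "content": legacy_json,
        "controls": {
            "openapi": "3.0.1",
            "info": {"title": "rest", "version": "0.0.1"},
            "paths": paths or {},
        },
    }


order = {"id": 106, "product": "Licensed Concrete Keyboard", "cost": 571.0}

# Step 1: same payload as before, no links yet.
step1 = to_ohm(order)

# Step 2: later, start advertising operations.
step2 = to_ohm(order, {"/api/orders/106": {"delete": {"summary": "Delete order 106"}}})

print(step1["controls"]["paths"])
print(list(step2["controls"]["paths"]))
```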
Conclusion
I truly believe that REST can bring a lot to web applications. Hopefully, OHM makes it easy to adopt if you're familiar with OpenAPI. Don't hesitate to give it a try and show what you build with it !
The next steps for me will be to grow the ecosystem, starting with a helper Java library (which just needs to be extracted from the sample application code). I also think it should be possible to integrate with Spring-Hateoas, so I'll look into that. Don't hesitate to reach out if you'd like a helper library for your favorite language. Oh, and I won't blame you if you continue to call RPC HTTP APIs REST, everybody does it anyway!
You can follow me on Twitter.
OHM links:
- The OHM specification: https://github.com/cbornet/ohm
- The OHM browser: https://raw.githack.com/cbornet/swagger-ui/ohm/dist/
- A sample OHM application: https://rest-openapi-demo.herokuapp.com/api (https://github.com/cbornet/sample-rest-app)
More readings on REST and Hypermedia:
- "Get Autonomy and Resilience for Free With Hypermedia APIs" by Mike Amundsen : http://amundsen.com/talks/2020-10-apiworld/2020-10-apiworld-hypermedia.pdf
- Roy Fielding's REST dissertation : https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
- "REST APIs must be hypertext-driven" by Roy Fielding : https://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
Top comments (10)
Hi, this is an interesting idea.
Unfortunately, providing all the actions dynamically in the "controls" field seems to be limiting the use of this to quite generic, interactively used tools (as far as I can see) like your OHM browser, and therefore likely won't fit for the majority of HTTP APIs out there.
As you've linked here from OpenAPI-Spec #577, you of course know my point of view here: As a client programmer, I would like to have the available operations (or at least a set of which operations could be available) at "client implementation time", not just at runtime, so I can build my code around this, both for custom UIs (e.g. a customer care UI for your customers/orders example), and for server-side-only API usage (e.g. actually processing the order, by triggering the shipment in the warehouse).
Hi Paulo,
What you're describing, getting available operations at "client implementation time", is indeed not HATEOAS and is not the purpose of OHM. For this there is the static "links" field of OpenAPI.
As I tried to say in the blog, HATEOAS is not a good fit for external applications using the API as a library since the application state is not in the API but in the client consuming it.
So generally you would use HATEOAS for your own web applications, not for external ones and not for micro-services.
For documentation, the advantage is providing navigability of the doc which can bring more context than the "links" object of OpenAPI.
Also, one of the advantages I'm seeing on the project I'm working on is that with the OHM-browser, we can provide a usable UI for the application before the front-end is ready.
I have to agree with Paulo here and I think you're mistaken about HATEOAS.
Getting available operations at "client implementation time" is most certainly HATEOAS, as the client MUST understand the hypermedia in order to do anything with it. You even said it yourself in the blog post with "Another advantage of using REST and OHM for Web services is that you can provide an endpoint that lists all the possible operations that are available for clients." You say HATEOAS is not a good fit for external applications, but that is what all the web is. The browser is a client for the HTTP protocol, and how many different browsers exist? There is an excellent talk by Jim Webber about REST application design at vimeo.com/41763224.
Swagger, and by extension OHM, are too focused on the PATH. The PATH should be opaque to the client. A server should be able to change all its URLs to GUIDs and not break any clients. It's the resources and the link relations that tie them together that are important. A great example of REST documentation is the Amazon API Gateway REST API docs. It uses HAL+json as its media type and defines the resources, link relations, and starting API endpoint.
Sorry but I can't agree with that. HATEOAS provides affordances through links dynamically at runtime, not statically at client implementation.
The "PATH" is opaque in OHM. Look at the OHM-browser : it doesn't know any path before using the application.
You can totally do the same with OHM. The advantage is that OHM supports all H-Factors whereas HAL supports only a few. HAL supports only navigation links through GET. OHM supports all types of operation, can describe forms, etc...
I agree with that. What I was trying to say is that the client still needs to be able to understand what those opaque links represent by knowing at implementation time the set of possible link relations it MIGHT encounter. Not that it's preprogrammed with the workflow steps/navigation, but if it sees a "next" or "user-disable" link relation, it knows how to proceed/present that affordance to the user.
I must have missed this piece, is OHM meant for documentation or are you proposing it as an implementation hypermedia type?
That's the semantic of the application that must be a shared understanding between the application and the user-agent. In HTML, that's generally human language. In OHM there are several possibilities : for humans (or any intelligent form), you can use the "summary" and "description" fields of the operations ; for computers (or anything with a limited vocabulary) you can use the "operationId" field of the operation. As said in the post, OHM doesn't define a vocabulary so you need to define your own or reuse an existing one.
It's a Hypermedia type. I thought it was clear in the title. Apparently not...
Interesting approach. Was any thought given to using dynamically generated enums for the values of query and path parameters rather than hard-coding query strings in the pathItem keys (which is technically not allowed by OpenAPI)? As you are generating pathItems not fully formed URLs I think this could be a valuable improvement.
I also note that there are no responses keys within the operations generated by the sample app - this won't be a problem in OAS 3.1 though where they are optional.
Oh, and HTTP uses "methods" not "verbs" and just because an HTTP API doesn't use HATEOAS does not make it RPC. It can still be resource orientated not action oriented and use more than one method per URL. :) dev.to/mikeralphson/why-there-is-n...
Thanks a lot Mike for the detailed and very valuable review and comments.
Indeed, single-value enums are the way to fix parameters. It worked for swagger-ui to pass query params in the path element but I agree this is not standard. I'll update the sample app.
I'll also update the sample to output the responses field. Here also, it worked with swagger-ui but better be fully spec compliant !
I'll change "verb" to "method" which is more correct (although I've yet to see an HTTP method which is not a verb).
But I think RMM REST level 2 APIs are still RPC.
GET /api/users/ conveys the exact same semantics as getUsers. Separating the resource and the verb into distinct fields of the request doesn't change that. The best proof is that it's possible to generate client SDKs that do the translation for a given programming language. Another proof is that it's easy to replace these APIs at any time with another RPC mechanism (gRPC, SOAP, etc.). And I think that was the point of Fielding's rant. Note that I don't say RPC is inferior (I love gRPC), just that it's not REST as defined by its author. Anyway, I know that the battle is lost and that REST level 2 is now REST. So those are just historical details. As we say, "usage is law".
Thanks for the reply, as I said I do find this approach interesting and worthwhile. I blogged about returning partial OAS documents here: dev.to/mikeralphson/the-hypermedia...
If you use single value enums, won't you hit the problem of clashing (identically-defined) pathItem keys? How would you distinguish them? Hence my suggestion to dynamically build the enum lists based on allowable values for the parameter (maybe within a page range etc to prevent the list being too large).
Fielding (2000) says: (In section 6.5.2 'HTTP is not RPC') "What makes HTTP significantly different from RPC is that the requests are directed to resources using a generic interface with standard semantics that can be interpreted by intermediaries almost as well as by the machines that originate services. The result is an application that allows for layers of transformation and indirection that are independent of the information origin, which is very useful for an Internet-scale, multi-organization, anarchically scalable information system. RPC mechanisms, in contrast, are defined in terms of language APIs, not network-based applications." For example, HTTP APIs are generally cacheable, RPC mechanisms generally are not.
"HEAD" and "OPTIONS" are the two obvious non-verb standard HTTP methods. :)
"Great minds think alike"
Yes. That's a problem of OAI using a map for pathItems. I could have chosen to directly reference an array of pathItems in controls, but then it wouldn't be 100% compatible with existing tools and libs (e.g. swagger-ui), which is I think a strength of the OHM format. E.g. in the paging implementation, you can have 2 links with the same path+method (e.g. "first page" and "previous page"). I worked around it by setting a fragment in the path to distinguish the 2 affordances, this fragment being ignored server-side. The problem with an enum list is that if you have several params, you don't know which combination of enum values is allowed. And I don't see how to associate an affordance (e.g. "first page") with an enum value.
I opened github.com/cbornet/ohm/issues/1 to track this issue.
If Roy says it, I guess this is it... But we still call procedures remotely...
I should get some sleep...