From the comments below, several very interesting resources have surfaced. The most damning of these articles is a nine-year-old blog entry by Roy Fielding (father of REST). This quote appears:
What needs to be done to make the REST architectural style clear on the notion that hypertext is a constraint? In other words, if the engine of application state (and hence the API) is not being driven by hypertext, then it cannot be RESTful and cannot be a REST API. Period. Is there some broken manual somewhere that needs to be fixed?
So, categorically, unless we are using HTTP to directly manage state - and not merely as a messaging layer - we are not RESTful.
Kudos to the comment from Kasey Speakman, which included the link to Martin Fowler's article on the Richardson Maturity Model (RMM), which led to the Fielding entry.
Also thanks to all those with constructive comments that helped square away my understanding of a RESTful architecture. While I don't care for the coupling that comes from intertwining the app with the protocol used to communicate between client and server, it's clear that decoupling those items violates the intended definition of REST.
Where to go from here?
Not sure yet. I owe it to myself to take a look at the "HATEOAS (Hypertext As The Engine Of Application State)" mentioned in the Fowler article on RMM. I need to reconcile the dual role of HTTP to provide both transport and management of application state.
Nota bene: I'm really looking forward to comments that begin, "What about...," or "In addition, ..." as it's a great benefit to enumerate additional considerations. Thanks.
We develop and operate a private web portal for clients. That portal allows clients to log in, save information, license intellectual property, and generate highly customized, print-ready PDFs for use in their business. As such, the API is not open and we do not integrate with other apps or services. The lack of such integration greatly simplifies what we need to do, as it eliminates many of the normal benefits received by following the current convention. That said, ...
A faithful RESTful implementation necessitates use of appropriate HTTP verbs.
This motivation is no longer a concern. However, I'm not finished chewing on the bone I found.
The reason for pursuing this line of thought at all was out of concern for reducing the amount of information exposed in server log files. So, it began more as a curiosity than as a possible solution to a specific problem. To that end, this reads better as thinking out loud rather than a manifesto for revolution.
If I'm using the usual and customary RESTful implementation, the path (but naturally not the non-GET payloads) appears in the server logs. A reasonable application of knowledge and an exposure in authored or incorporated software could lead to a breach. While I like the idea of being able to scrape server logs for intelligence, in practice I just don't do it. There are other, more intelligent means by which to gain that kind of insight regarding usage. Therefore, information is being exposed without offering any value in return for the perceived risk.
The client-side software and the server-side software don't require the use of HTTP Verbs to communicate intent -- those two pieces can communicate with each other, provided the browser and web server can communicate.
Ergo, we require only that the communications are successful, and ask the browser and web server (Apache, IIS, nginx, etc.) to provide the comm link or information about why the comm link is unavailable. That's their job; let them do it. But don't ask them to take on responsibility for the entire solution.
The alternative: encapsulate solution-specific server-side response codes and data within the payload.
To that end, we elect to simply prefer the POST method with a message header and message body to communicate intent currently conveyed via HTTP verbs. We accept use of the GET verb where we don't mind exposing our RESTful structure (...or wish to permit bookmarks or support caching).
The message header contains two fields (hidden inputs, if you will):
- request_type : New | Create | Edit | Update | Show | List | Delete
- request_path : /resource/id/resource2/id2/...
The message body contains all the usual suspects.
*---------------*
| request_type  |
| request_path  |
+---------------+
| field1        |
| field2        |
| ...           |
| fieldN        |
*---------------*
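As a rough illustration of the request layout above, here is a minimal Python sketch. The JSON encoding, the `header`/`body` key names, the `RequestEnvelope` class, and the sample path and field are all assumptions made for the example; the post itself only specifies the two header fields and the list of request types.

```python
import json
from dataclasses import dataclass, field
from typing import Any

# The seven request types listed in the post.
REQUEST_TYPES = {"New", "Create", "Edit", "Update", "Show", "List", "Delete"}


@dataclass
class RequestEnvelope:
    """Hypothetical container for one POSTed request (names are assumptions)."""
    request_type: str
    request_path: str
    body: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject anything outside the agreed vocabulary before it is sent.
        if self.request_type not in REQUEST_TYPES:
            raise ValueError(f"unknown request_type: {self.request_type!r}")
        if not self.request_path.startswith("/"):
            raise ValueError("request_path must start with '/'")

    def to_json(self) -> str:
        # Header and body travel together in a single POST payload,
        # so neither the action nor the path shows up in the URL.
        return json.dumps({
            "header": {
                "request_type": self.request_type,
                "request_path": self.request_path,
            },
            "body": self.body,
        })


# Example: an update to a nested resource (path and field are made up).
envelope = RequestEnvelope("Update", "/clients/42/documents/7", {"title": "Q3 brochure"})
payload = envelope.to_json()
```

Validating `request_type` on the client is a design choice in this sketch, not something the post requires; the server would still need to check it.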
Solution-specific, server-side responses are provided with a response header and response body. The response header takes on the responsibility for conveying the status of the request (replacing use of HTTP status codes formerly used for this purpose) while the response body contains any requested data as per usual.
HTTP headers, ...
HTTP_STATUS: 200, 400, 404
data:
*------------------*
| response_status  |
+------------------+
| field1           |
| field2           |
| ...              |
| fieldN           |
*------------------*
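The split between the HTTP status and the app-level `response_status` might look something like the following Python sketch. The JSON shape, the `header`/`body` keys, and the function names `build_response` and `read_response` are assumptions for illustration only.

```python
import json
from typing import Tuple

# The only HTTP status the app itself ever emits in this sketch;
# 400 and 404 are left to the web server.
TRANSPORT_OK = 200


def build_response(response_status: int, data: dict) -> Tuple[int, str]:
    """Server side: the HTTP status says the exchange worked,
    while the app outcome rides inside the payload."""
    payload = json.dumps({
        "header": {"response_status": response_status},
        "body": data,
    })
    return TRANSPORT_OK, payload


def read_response(http_status: int, payload: str) -> Tuple[int, dict]:
    """Client side: check the comm link first, then read the app status."""
    if http_status != TRANSPORT_OK:
        # Transport-level problem (e.g. 400 or 404 from the web server).
        raise ConnectionError(f"transport failure: HTTP {http_status}")
    message = json.loads(payload)
    return message["header"]["response_status"], message["body"]
```

With this arrangement the client has two separate branches: one for "did the request get through" and one for "what did the app say", which is the separation argued for above.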
I think this decouples the various browsers and web servers from my client code and server-side code. I still have to use POST and GET, but I only have to use POST and GET. And GET could be optional and rare, since I still want to eliminate information from server logs.
- Caching will be affected by using POST where GET is a legitimate candidate; to be investigated
- Bookmarking will be affected by using POST where GET is a legitimate candidate; to be investigated
- Currently accepted principles will be violated by using POST where GET is a legitimate candidate; tradeoffs will be evaluated
(Still too long; but I'm curious)
Here's a thought process that was radial and branching in nature, but had to be tortured into a linear space, here.
What good are you?
I've spent a few years torturing our web app from a hacked solution towards a more RESTful solution. I love the simplicity of the relative address path and the nature of the seven operations - eight for us, since I've got a parking lot item to handle search as a first-class request.
But during that time, I've struggled with the implementation of HTTP verbs and REST. Put simply, HTML doesn't support the use of all HTTP verbs. Form methods may only be GET or POST (and one can fairly ask why support GET); the PUT, PATCH, and DELETE options are not supported.
Barring some unknown consideration, it makes sense to me that the HTML form spec allow for:
- method=POST to create a resource
- method=PATCH to update a resource
- method=PUT to replace a resource
- method=DELETE to delete a resource
So what good are you? We jumped through some hoops only to have my intentions squashed on the server. No thanks.
HTTP Status Codes
Found this resource, which nicely captures the use of HTTP status codes to respond to HTTP verbs. However, the vast majority have more to do with the app's response than with the web server's communications response to the browser:
| HTTP verb | Status code | App scope | Web server scope |
|---|---|---|---|
| POST (create) | 201 (created) | Yes | - |
| POST | 404 (not found) | Maybe | Maybe |
| GET (read) | 200 (OK) | Yes | - |
| PUT | 204 (no content) | Yes | - |
| PUT | 405 (method not allowed) | Yes | - |
| PATCH | 204 (no content) | Yes | - |
| PATCH | 405 (method not allowed) | Yes | - |
| DELETE | 405 (method not allowed) | Yes | - |
Instead of this mess, I'd rather just focus my communications between browser and web server on the 200 (OK), 400 (bad request), and 404 (not found) responses. Either the communication was good and I ask the app to continue processing using the data returned, or the request was malformed (400) or the end point was not found (404), and we handle that response as appropriate.
The HTTP status codes can provide some guidance for app return codes in that we may want a standardized response where
| Stolen HTTP status code | Means what it means |
|---|---|
| 201 | success on creation of resource |
| 204 | there is no content to be displayed |
| 404 | bad resource id |
| 405 | method not allowed |
| 409 | resource already exists (duplicate?) |
| 418 | in the event we ask a teapot to brew coffee |
But, these are returned in an app response header, not the HTTP response.
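One way to pin down that borrowed vocabulary in code is an enumeration. The sketch below is only an illustration: the `AppStatus` name, its member names, and the helper functions are assumptions, while the numeric codes and their meanings come from the table above.

```python
from enum import IntEnum


class AppStatus(IntEnum):
    """App-level outcomes, numbered after the HTTP codes they borrow from."""
    CREATED = 201             # success on creation of resource
    NO_CONTENT = 204          # there is no content to be displayed
    BAD_RESOURCE_ID = 404     # bad resource id
    METHOD_NOT_ALLOWED = 405  # method not allowed
    ALREADY_EXISTS = 409      # resource already exists (duplicate?)
    TEAPOT = 418              # we asked a teapot to brew coffee


def is_success(status: AppStatus) -> bool:
    """Treat the 2xx range as success, mirroring the HTTP convention."""
    return 200 <= status < 300


def parse_status(raw: int) -> AppStatus:
    """Reject codes outside the agreed table instead of guessing."""
    try:
        return AppStatus(raw)
    except ValueError:
        raise ValueError(f"unrecognized app status: {raw}") from None
```

Because these values live in the app response header, an app-level 404 (bad resource id) never has to be confused with a transport-level 404 (end point not found).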
Client Side Hassle
This comes down to the incurred cost of developing or using a package to support AJAX communications that are formally correct but get tortured into a POST by the time the web server passes the request to server-side code. It may only become a personal preference to branch off a request header rather than a verb in the communications overhead. Either way, the detection work still needs to be done. But I feel better working off the request than the communications header.
Server Side Hassle
...continuing, pulling my action detection out of the comms header and into a request header allows for separation of web server code from request code. That is, my server script end point merely has to pull the request header and body from the web server and feed it to the server-side code. Now, we could interpret the web server request and translate it for the server-side code, but why introduce that action at all ? Why not decouple altogether and simply state the action in a request header which passes untouched from the client-side to the server-side ?
Revisiting the whole use of HTTP status codes to tell me about something that happened in the server-side code (beyond the web server). Just as I don't want the server-side code using information from the comm header (GET|POST) to know what action to take (NEW|CREATE|EDIT|UPDATE|SHOW|LIST|DELETE), I don't want the client-side code to take app-related action based on information in the comm response.
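A server-side end point that branches on the request header rather than on the HTTP verb could be sketched roughly as follows. Everything here is hypothetical: the JSON payload shape with `header` and `body` keys, the handler functions, the `HANDLERS` table, and the returned fields are assumptions chosen to keep the example small.

```python
import json
from typing import Callable, Dict

# A handler receives the request_path and the body fields.
Handler = Callable[[str, dict], dict]


def show(path: str, body: dict) -> dict:
    # Placeholder: a real handler would load the resource at `path`.
    return {"response_status": 200, "path": path}


def update(path: str, body: dict) -> dict:
    # Placeholder: a real handler would persist the changed fields.
    return {"response_status": 204, "path": path, "changed": sorted(body)}


# Only two of the seven request types are wired up in this sketch.
HANDLERS: Dict[str, Handler] = {"Show": show, "Update": update}


def endpoint(raw_payload: str) -> dict:
    """Single end point: the web server hands over the payload untouched,
    and the action is read from the request header, not the HTTP verb."""
    message = json.loads(raw_payload)
    header = message["header"]
    handler = HANDLERS.get(header["request_type"])
    if handler is None:
        # App-level "method not allowed", reported in the payload.
        return {"response_status": 405}
    return handler(header["request_path"], message.get("body", {}))
```

Since `endpoint` takes a plain string and returns a plain dict, it can be exercised without a web server, which is where the mocking and testing benefit mentioned later would come from.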
Why should the comm response be anything (for a web app) beyond 200, 400, 404 ?
Now, in the event the API is to be exposed for third-party usage that (historically) expects app-related responses from the comm layer, I get it. The legacy reasons for including those responses may be overwhelming.
I also expect this will have mocking/testing benefits although a proper decoupling of server side code may have accomplished this anyway, yielding no incremental benefit.
Standardizing the request header and request body on both the server and client sides means only dealing with a single model for creating and responding to a request. Check.
Reducing the ajax communications to concerns related to GET, POST and the pass/fail nature of the communications link streamlines communications processing between client and server. Check.
Implementing a response body into the data portion of a response complicates the client-side processing. A bit. Ugh.
Oh, and I don't expose any of our structure in server logs. When the path goes into the payload instead of the URI, the structure is hidden. That may not have been a concern a decade ago, but I am beginning to think it should be more of a concern today. Just as we played fast and loose with all of the early IoT toys around the house, we may be flirting with danger by allowing our resource end points to appear in server logs.
Thanks for taking the time to read this. And even more thanks if you choose to respond/comment with constructive criticism. Kudos.
Top comments (24)
If your primary concern is information exposure in server logs (why I’m not sure, but okay), reconfigure your server to log less information. Done. You don’t have to upend the entire HTTP philosophy just for that. A great benefit of a RESTful architecture is its standardization. You don’t have to reinvent the wheel and teach it to every new developer. If you’re vaguely familiar with RESTful conventions, a RESTful API is almost self-documenting. That can save a lot of headache down the line. Not to mention standard behavior of caching and such. You’re basically locking yourself out of the synergies of the existing RESTful paradigm which could enable you to develop and scale quickly.
Yep, I agree here. This seems like a lot of work to avoid an issue that should be solvable by a configuration change on your webserver.
Fair point on the destination server.
What about the hops between?
And to clarify, the question of server log content only began the thought process; it's not a motivating factor for making a change.
I would question why the little bit of information that may appear in server logs may in any way lead to breaches of any sort in the first place. If your security depends on the exact URL structure of your server being secret, your security is non-existent.
"Hops in between" for HTTP that matter at all to this discussion would only include SSL-terminating HTTP proxies, and they can log the full payload if their operators so desired and there's nothing you could do about that. Presumably any such proxies would be fully trusted by either the server or the client or both. So for our purposes, intermediate hops are irrelevant.
Thought that encryption applied to the POST payload, but not the information appearing along with the URL (GET parameters?).
Encryption is only applicable in end-to-end scenarios, in which case intermediate hops are totally irrelevant - you either don't have them at all, or they're just TCP proxies / IP routers which see nothing but the TCP/IP headers.
If you have a ssl-terminating proxy (i.e., it intercepts your SSL traffic and re-encrypts it to relay it to the final destination) it can log everything.
There's nothing in between.
In other words:
You're only making your own life more difficult for little to no advantage that I can discern. Do not underestimate the advantage of a properly implemented HTTP API when you find that you need to start scaling up. It's not that easy to throw a CDN or additional servers into the mix when you've painted yourself into a corner with a custom protocol that is uncacheable/unproxiable/unwhateverable.
This is the type of "what about" comment I was hoping to get.
I think I've conflated the RESTful approach to utilizing the capabilities of HTTP with using HTTP verbs and status codes to satisfy the CRUD requirements of a web app.
I'm going to have to think about separating some concepts more clearly.
As a thought exercise, why should a RESTful architecture depend on HTTP ?
Sure, you could relegate HTTP to be a mere transport mechanism and implement your own "RESTful" protocol on top of that. Does that provide any advantage? Does it allow you to easily switch the transport layer to something else down the line? Is that a foreseeable requirement? Also see Inner-platform effect.
Good read. Thank you.
And the somewhat related en.m.wikipedia.org/wiki/Not_invent...
It's possible to avoid using HTTP verbs, but I would not call this approach RESTful. Depending on the context it can be a good solution, but, probably, not a universal one. One problem that I see with this approach is the fact that to act on a request you need to parse the request body because the standard HTTP data does not provide enough information. For example, if you need to route some requests to a different backend using NGINX, you will have to parse the request body instead of using URL-based pattern matching.
Regarding the URLs in the logs, what about configuring your servers so that they don't log requests? And I don't see a big problem with it.
Fair criticism - as we currently implement RESTful solutions.
I came to question why a RESTful solution (the purview of the app) should depend on the communications protocol (HTTP).
Turning it inside out, what if the available HTTP status codes don't provide enough information on the nature of a response to provide an effective solution ?
Why can't/shouldn't we have a RESTful solution without relying on HTTP ?
You can do it without relying on HTTP but you need to follow constraints: en.wikipedia.org/wiki/Representati... In your suggested approach, I see the following broken constraints: Cacheability (you can build your own without relying on HTTP though), Uniform interface (because when you GET a web page in the browser, you should be able to update it with a PUT back). You can achieve uniform interface but, then, I believe your client should not be the browser itself but something running inside the browser. But that's all vague anyway. More important is that you will be building a protocol on top of HTTP, so you will not be able to use some of the standard tools and conventions.
Thank you. This is precisely the kind of "what about" comment I was hoping to get
Several ideas working their way through my thoughts right now between your comment and David Zentgraf's, above.
I like this article, and I think you have made an important realization. In practice I -- along with most people -- have never done a Level 3 REST API, because they cost a lot to develop (for API consumers too). But going less than Level 3 does not provide the supposed benefits of a uniform interface. Nor does it even count as "REST" according to its creator. Yet obviously many people see the benefit of using some of the patterns from REST for specific benefits. Examples: content negotiation, hypermedia, scalability from statelessness and/or cacheability.
I gave up on REST myself. I have taken to using HTTP as just a transport protocol, with a simple interpretation of the contents as a message. Instead of "media types", my client apps know how to work with "message types". These are business-level requests/responses, as opposed to "resources" (which a lot of people believe to be database tables). All messages are sent to a few well-known endpoints of the given API. This does not prevent the API from being scalable or using content negotiation and hypermedia. But I am not required to use them either. I can start simple and layer on these complexities as the need arises. Since using this strategy, I run into far fewer situations where I am puzzled trying to figure out how to squeeze my use case into REST constraints. Instead, it's more about puzzling to understand the business's precise problem so I can determine the right messages to add.
The Fowler link led to the Fielding link.
Results of those readings are included in my original post as "update 2".
Very nicely done.
I've implemented several REST APIs that are nearly POST-only. And they are damn clean: you can easily generate an SDK, e.g. from type-safe C# to type-safe TypeScript, and you can read it like you would a normal API.
My goal with REST APIs is a clean SDK, e.g.
On the web server I have Namespace.SomeController.Save; its URL will be /Namespace/Some/Save/ and it will generate a Namespace.Some.Save() function in TypeScript. Highly recommended.
You don't need to be a verb purist (for some reasons you already covered), but the key to me is to have a unique URL for every endpoint to keep it clean and easy to generate.
That isn't a REST API, by any measure of the word. It's a case of en.wikipedia.org/wiki/Remote_proce.... That's not to say that that's inherently bad (far from it), but to call it 'REST' it needs to at least have a reasonable amount of overlap with what is generally considered 'REST'.
Ha. Good point. When challenging assumptions, it's always best to take the temperature of the room.
However, I still think it's worth thinking about whether what we communicate is inseparable from how we communicate.
Could just tell them "it's RESTful, but the implementation is over here. We only use HTTP for communication."
She might still quit, but only roll her eyes on the way out.