Why SnipURL’s API is Unsafe a.k.a. How NOT to design your Web API

Sigh.

If you’ve read my blog, you know that I like well designed URLs. You’ve probably also discerned that I like web APIs with well designed URLs. And further you might be aware that I like APIs that are as lightweight as possible; i.e. ones that accept PUT and/or POST content of mime-type “application/x-www-form-urlencoded” and that respond with a body of the “text/html” or “application/json” mime-types. So on the surface you might think that I’d really like the SnipURL API; after all it is the epitome of simple.

But unfortunately, I don’t like SnipURL’s API. You see, it violates one of the most important tenets of HTTP, which is that GET requests MUST be “safe”; i.e. a GET should have no side effects. SnipURL allows programs to use HTTP GET to issue a request to add a URL to their database of “snips.”

For example, to “snip” the URL “http://blog.welldesignedurls.org/url-design/” and enable the URL “http://snipr.com/urldesign” to redirect to it, simply type the following URL into your browser (your browser will issue a “GET” request to SnipURL’s server API):

http://snipr.com/site/snip?r=simple
&link=http://blog.welldesignedurls.org/url-design/
&title=URL+Design
&snipnick=urldesign

This GET could also have been issued by a programming language, which of course is the whole point of an API. Unfortunately, the fact that the SnipURL server adds a record to their “snip” database on a GET request is a side effect and violates the HTTP spec. But before you think I’m just being a pedantic standardista with my undershorts on too tight, take a gander at the firestorm that resulted when Google released its Google Web Accelerator in early 2005. GWA caught lots of web developers with their pants down, including the golden boys of Rails who evidently hadn’t read the HTTP specs either (to their credit, they’ve made great strides since.)
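To make the "issued by a programming language" point concrete, here is a minimal sketch in Python (my choice of language; the helper name build_snip_get_url is hypothetical). It only builds the request URL from the example above and deliberately does not send it, because with this API anything that dereferences the URL, whether a script, a prefetcher, or a spider, would create a snip.

```python
from urllib.parse import urlencode

# Hypothetical helper: builds the "snip" GET URL from the example above.
# It does NOT send the request; fetching this URL is what creates the snip.
def build_snip_get_url(link, title, snipnick):
    query = urlencode({
        "r": "simple",
        "link": link,          # percent-encoded by urlencode
        "title": title,        # spaces become "+"
        "snipnick": snipnick,
    })
    return "http://snipr.com/site/snip?" + query

url = build_snip_get_url(
    "http://blog.welldesignedurls.org/url-design/",
    "URL Design",
    "urldesign",
)
print(url)
```

Note that urlencode percent-encodes the link value, which is what a careful client should do; the URL typed by hand above leaves it unencoded.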

So when I realized their API was violating the spec I of course emailed their Google group with comments very similar to the ones I left on Douglas Karr’s blog, where I first learned about SnipURL, in hopes they could nip any damage in the bud. I’d point you to my email on the list but unfortunately they moderated my comments. On the other hand they did reply to me by email. They responded with the following otherwise cordial email that was filled with mistaken assumptions:

Hello,

Thanks for an informative post.

Snipurl’s API is based on requests from our own users who wished to use our API, who stated clearly that they would prefer not to have to go through the rigmarole of charting through XML return values, and instead preferred a simple snip returned to them.

As for the comment: “GET is used for safe interactions and SHOULD NOT have the significance of taking an action other than retrieval.”

This is exactly what Snipurl’s API does. While we “get” the form values, they are rigorously checked for validity (which is in our own interest; otherwise miscreants could ruin our system) and only a value is returned.

The POST method is also significantly more resource-intensive when you talk of non-trivial traffic. Because we are a free service, we chose our approach based on practical reasons rather than puritanical design goals.

Again, it is great to have some feedback from the community. If I can find the time, I’ll be doing a REST API sometime, as has been on the cards anyway.

Many thanks
SnipURL Editor

So first off it is clear from their reply that their knowledge of HTTP is limited and that they are making assumptions that don’t follow from the comments I sent them. But I’m not posting their email to attack them or make fun of them; I’m posting it to illustrate that many professional web developers today, probably more than half, don’t know the rules of HTTP. I’m using their email as an example in hopes of educating those who, like me for the first 10+ years of my web development career, just let their tools isolate them from HTTP (yes I’m talking to you Visual Studio, IIS, and ASP/ASP.NET.)

So let me address their points one at a time.

First they said their users wished to use an API without having to “go through the rigmarole of charting through XML return values.” Obviously they were reading my email in haste and assuming that I was advocating XML when I most definitely was not. And it is all the more ironic that they assumed XML advocacy given my recent anti-XML rant on the rest-discuss list. To be clear, I didn’t suggest the use of XML, and what I did suggest (using PUT or POST) is no harder than programming a GET.

Second, their claim that “rigorously checking for validity” makes their GET “safe” demonstrates a fundamental ignorance of what the term “safe” means in relation to HTTP GET. The term has a very precise meaning according to the spec, and for those that don’t know it I highly recommend they read “Safe Interactions” from the W3C Technical Architecture Group’s Finding “URIs, Addressability, and the use of HTTP GET and POST.” In addition, the checklist explaining when to use GET vs POST from the same document is both short and highly readable, so web developers who’ve never learned the HTTP spec in detail should at least read that.
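A toy illustration of the distinction may help. The sketch below is an in-memory model in Python, not SnipURL’s actual code; the class SnipService and its handle method are made up for this example. The point it tries to show is that “safe” is about whether a request changes state on the server, not about how carefully the inputs are validated.

```python
# Toy in-memory model of a snip service (hypothetical; not SnipURL's code).
class SnipService:
    def __init__(self):
        self.snips = {}  # nickname -> target URL

    def handle(self, method, nick, link=None):
        if method == "GET":
            # Safe: only reads state, so spiders and prefetchers can
            # repeat it as often as they like without changing anything.
            return self.snips.get(nick)
        if method in ("PUT", "POST"):
            # Unsafe by design: the client explicitly asked for a change.
            self.snips[nick] = link
            return link
        raise ValueError("unsupported method: " + method)

service = SnipService()
service.handle("PUT", "urldesign", "http://blog.welldesignedurls.org/url-design/")

before = dict(service.snips)
for _ in range(3):                      # a spider following the same link
    target = service.handle("GET", "urldesign")
unchanged = (service.snips == before)   # GET left the store untouched
print(target, unchanged)
```

An API that creates the record inside the GET branch fails this test no matter how rigorous its validation is.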

Lastly, their claim that “the POST method is also significantly more resource-intensive when you talk of non-trivial traffic” (when compared to GET) shows they really do not understand GET or POST. Let’s look at the difference between the two; the first is a GET and the second is a POST (NOTE: I’ve wrapped the HTTP requests to keep from overflowing the blog’s borders but you would not wrap them in a real HTTP request. The wrapped lines start with “>>>”):

GET http://snipr.com/site/snip?r=simple
>>> &link=http://blog.welldesignedurls.org/url-design/
>>> &title=URL+Design
>>> &snipnick=urldesign
>>> HTTP/1.1

POST http://snipr.com/site/snip HTTP/1.1
<blank line goes here>
r=simple
>>> &link=http://blog.welldesignedurls.org/url-design/
>>> &title=URL+Design
>>> &snipnick=urldesign

So where’s the resource intensiveness of the latter? The two put essentially the same characters on the wire; a complete POST adds only Content-Type and Content-Length headers, a few dozen bytes (not that a few characters make any difference except at “Yahoo-scale.”) What’s more, writing the snip to their database will cost orders of magnitude more than any difference between GET and POST (if there even is any difference), though there may be some trivial differences on some server frameworks.
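For anyone who wants to check the size question themselves, here is a rough Python sketch that builds both requests as raw bytes and compares their lengths. It assumes origin-form request lines with a Host header (rather than the absolute URLs shown above) and a minimal header set; a real client would add more headers to both requests equally.

```python
from urllib.parse import urlencode

# Same parameters as the GET and POST examples above.
params = urlencode({
    "r": "simple",
    "link": "http://blog.welldesignedurls.org/url-design/",
    "title": "URL Design",
    "snipnick": "urldesign",
})

# GET: parameters travel in the request line.
get_request = (
    f"GET /site/snip?{params} HTTP/1.1\r\n"
    "Host: snipr.com\r\n"
    "\r\n"
).encode("ascii")

# POST: the same parameters travel in the body, plus two describing headers.
body = params.encode("ascii")
post_request = (
    "POST /site/snip HTTP/1.1\r\n"
    "Host: snipr.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
).encode("ascii") + body

extra = len(post_request) - len(get_request)
print(len(get_request), len(post_request), extra)
```

Under these assumptions the POST is larger only by its two extra header lines, which is noise next to a database write.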

But better than a POST, I’d recommend a PUT. Note how I PUT to the URL that we want to be our snip URL thus eliminating the awkward “snipnick” parameter (this PUT’s body is wrapped too):

PUT http://snipr.com/urldesign HTTP/1.1
<blank line goes here>
r=simple
>>> &link=http://blog.welldesignedurls.org/url-design/
>>> &title=URL+Design
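Here is how a client might construct that PUT, as a sketch in Python using the standard urllib.request module. To be clear about assumptions: SnipURL does not offer this PUT endpoint (it is my proposal), and the helper name build_snip_put is hypothetical. The request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical helper for the PROPOSED design: PUT to the snip URL itself,
# so no "snipnick" parameter is needed. SnipURL does not expose this endpoint.
def build_snip_put(nick, link, title):
    body = urlencode({"r": "simple", "link": link, "title": title})
    return Request(
        url="http://snipr.com/" + nick,
        data=body.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="PUT",
    )

req = build_snip_put(
    "urldesign",
    "http://blog.welldesignedurls.org/url-design/",
    "URL Design",
)
# Constructed only; passing req to urllib.request.urlopen would send it.
print(req.get_method(), req.full_url)
```

Because the target URL names the resource being created, repeating the same PUT leaves the server in the same state, which is exactly the behavior a retrying client wants.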

So hopefully the SnipURL Editor and anyone else reading this will now realize that it is important to ensure that APIs always use HTTP GETs in a “safe” manner. Of course if they don’t “get” what I’m trying to say (pun intended), then maybe I should just post the following code to a page on my website (or something similar for other HTTP-violating APIs) and let Google’s spiders have at it. :-)

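<html>
<head>
   <title>Spidering SnipURL's Naughty API</title>
</head>
<body>
<h1>Spidering SnipURL's Naughty API...</h1>
<dl>
<?php
   // Snip request template; <<nick>> is filled in per link so each snipped URL is unique.
   $url = "http://snipr.com/site/snip?r=simple" .
      "&link=http://www.w3.org/2001/tag/doc/whenToUseGet?nick=<<nick>>" .
      "&title=HTTP+GETs+MUST+be+SAFE" .
      "&snipnick=";
   // Nick prefix from the query string, limited to a-z and 0-9 so it is safe to echo.
   $prefix = isset($_GET['nick']) ? preg_replace('/[^a-z0-9]/', '', $_GET['nick']) : '';
   // 36 iterations: 0-25 give 'a'-'z' (97+$i), 26-35 give '0'-'9' (22+$i = 48..57).
   for ($i = 0; $i < 36; $i++) {
      $nick = $prefix . chr(($i < 26 ? 97 : 22) + $i);
      $nick_url = str_replace('<<nick>>', $nick, $url) . $nick;
      // Each <dt> link extends the prefix, giving a spider an endless tree to crawl.
      print '<dt><a href="?nick=' . $nick . '">' . "$nick</a></dt>";
      // Each <dd> link is a snip-creating GET; escape "&" for the HTML attribute.
      print '<dd><a target="_blank" href="' . htmlspecialchars($nick_url) . '">HTTP GETs MUST be SAFE</a></dd>';
   }
?>
</dl>
</body>
</html>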

Whadayathink? Should I post it to my site? ….. Nah, I’ll be nice today.

P.S. After writing this post it’s occurring to me that maybe SnipURL simply thought I was suggesting they use a SOAP API instead? Clearly all their rationales would apply against SOAP as:

  1. SOAP uses XML,
  2. SOAP “safety” implies validation (I think), and
  3. SOAP is a LOT more overhead than using a GET or a PUT/POST.

Think maybe that’s what SnipURL thought I meant?


7 Responses to Why SnipURL’s API is Unsafe a.k.a. How NOT to design your Web API

  1. Douglas Karr says:

    Great post and I think it’s fantastic that you took the time to notify them and let them know the issue!

    The concern for me would be someone highjacking URLs from their system and pushing them to another location other than the snip’d url.

  2. Ben Hoyt says:

    Hi Mike,

    I agree with your push for well-designed URLs and simple, RESTful practices. But I must admit I’m not sure you’re correct in this critique of SnipURL’s API.

    It’s true that one should only ever use GET for idempotent queries, but the standard doesn’t actually mandate that GET must always be safe. (I’m using “idempotent” and “safe” here as the standard does: loosely, idempotent means “doing it ten times is the same as doing it once” and safe means “no side effects”.)

    And here I quote from the HTTP spec, section 9.1, http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html (*emphasis* mine):

    “… the *convention* has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered ‘safe’. This allows user agents to represent other methods, such as POST, PUT and DELETE, in a special way, so that the user is made aware of the fact that a possibly unsafe action is being requested.”

    “Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, *some dynamic resources consider that a feature.* The important distinction here is that the user did not request the side-effects, so therefore *cannot be held accountable for them.*”

    So according to the spec, the “safeness” of GET is a (good) convention, but its idempotency is a requirement, as the next sections make clear. Note also the spec specifically says that “some dynamic resources consider [GET with certain side effects] a feature”. SnipURL are obviously okay with being “held accountable” for the side effects if the user didn’t intend them (and almost always he will have intended them).

    The reason this doesn’t matter with something like the SnipURL API is that you can repeat the “snip a URL” operation as many times as you want, and it always returns the same result. So the operation is repeatable, cacheable, spiderable, etc. (This is in sharp contrast to other operations where you definitely need POST, such as commenting on a blog, transferring money to someone’s account, etc.)

    As it happens, I run a similar service at DecentURL.com, and early on a guy convinced me to change the main “make a URL decent” action from POST to GET — because a GET is simpler to use in many contexts, and since the action is idempotent, everything’s cool. As he put it:

    “From the browser’s perspective, there is no difference than if the response had always existed for all time prior to the first request. One can cache that response without any perceptible effect, for instance, and bots can request it again and again without damaging anything.”

    So apart from the fact that SnipURL and DecentURL *do* actually adhere to the standard … also note that you’ve got to be somewhat pragmatic about these things. Most other URL redirection services out there use GET, and it works fine, for the above reasons. As Paul Buchheit said on the GET vs POST issue, “I’d rather have something imperfect but useful and popular than something ‘perfect’ but unfinished and unused.”

    Sorry about the long comment. Hope this helps clear up why using GET here is not only okay but also standard. :-)

    Cheers,
    Ben.

  3. Pingback: microBlog » GET, POST, safety, idempotency

  4. Ian Clarke says:

    Why do you advocate formatting data as application/x-www-url-encoded for PUT and POST (rather than, say, application/json)?

  5. Nicolas says:

    You++;

    I just found about this blog, looks like I’ll be reading it often!

    Using GET for unsafe operations is something I have been fighting hard against on a web app. Months later, the devs started using buttons on the UI for things that were “actions”. But keeping GET. And not only that… They didn’t even use a form!

    Clicky

    I kid you not.

    Then it was discovered a button in a link doesn’t work on some browsers. So now the button’s onclick handler (yup, Javascript) uses location.href to go to the correct URL, with GET of course.

    I’m gonna kill somebody.

  6. Nicolas says:

    goddamnit it parsed my HTML.

    <a href="whatever"><button>Clicky</button></a>

  7. Jack the Snipper says:

    Wow. I’m not really sure what I was looking for when I stumbled upon your post, but this was interesting. You are probably taking the standard beyond its intent and interpretation. By that, I’m only speaking to your assertion that adding an entry in a database causes the use of GET to stray from the standard. As a user, following, or GETting, that URL causes no harm that could even be construed as unsafe, by the standard’s definition, for you. That spec is funny anyway because it basically implies that users accept obligations via POSTs.

    By what appears to be your definition, every search form that doesn’t specify a method would be unsafe if the developer of that form made any attempt to save given search parameters for their own analysis. The very fact that SnipURL retaining an index to match what you provide them to what they provide you caused you to broadcast your overinterpretation of a standard to the world is hilarious!

    What action, in terms of adding an index entry into their database, might have an unexpected significance to you or others? Their use of get with an underlying database entry on their end does not fall outside of the standard. You are a standarista.
