Archive for the ‘Participation’ Category

URLQuiz #2: URL Equivalence and Cachability

Thursday, March 1st, 2007

This is quiz #2 of our ongoing URLQuiz series.

In this quiz, there are 26 pairs of URLs (A..Z), and for each pair the questions are: “Are these two URLs equivalent?”, i.e. do they return the same resource when dereferenced, and “Can they be cached as the same URL?”

You should answer ‘Yes‘, ‘No‘, or ‘Maybe‘ where ‘Maybe‘ means ‘It might return the same resource but should not be cached according to the specs.’

To answer, leave a comment and ideally explain your reasoning for each. Feel free to group answers based on your reasoning and/or the answer given (Yes, No, and Maybe.) Print it out and take the quiz with pencil and paper if you are serious about getting it right, and feel free to use a computer or browser or whatever to test your results before answering. Good luck!

About the TLD .foo[1]

Clarification (2007-March-02): Some people have stated that the server could possibly return the same resource for any two given URLs, so they felt the answer could never be ‘No.’ I definitely see their point; for example, http://mysite.foo/bar and http://mysite.foo/bazz could possibly return the same thing, but nobody would ever reasonably expect them to do so on their own. So let me clarify: I meant a quiz taker to select ‘No‘ in the case where the resource returned would definitely not be the same thing unless the developer or server admin explicitly programmed or configured the server to make it so. On the other hand, ‘Maybe‘ would be used in the case where someone might reasonably expect the two URLs to return the same resource even though RFC 3986 would define the two URLs as being different, such as in [footnote 2], or when it depends on the O/S of the server, as in [footnote 3]. Regarding fragments, the question is “In a transaction between a client and a server, is the cache allowed to view them as the same?” Regarding which can be cached, I was looking for what is appropriate per the spec: not whether any particular software in the cloud (i.e. routers, proxies, browsers, etc.) actually does cache, but instead “Would it be allowed to cache?”

Questions

  1. The ‘www’ domain
    1. http://mysite.foo/
    2. http://www.mysite.foo/
  2. Letter casing in path
    1. http://mysite.foo/Index.htm
    2. http://mysite.foo/index.htm
  3. Letter casing in domain
    1. http://MySite.foo/index.htm
    2. http://mysite.foo/index.htm
  4. Index.htm vs. Default.aspx
    1. http://mysite.foo/Index.htm
    2. http://mysite.foo/Default.aspx
  5. Trailing slash on domain
    1. http://mysite.foo
    2. http://mysite.foo/
  6. Trailing slash on path
    1. http://mysite.foo/path
    2. http://mysite.foo/path/
  7. Empty question mark
    1. http://mysite.foo/
    2. http://mysite.foo/?
  8. Empty parameter
    1. http://mysite.foo/?
    2. http://mysite.foo/?param=
  9. Port 80
    1. http://mysite.foo/
    2. http://mysite.foo:80/
  10. Port 443
    1. http://mysite.foo/
    2. http://mysite.foo:443/
  11. Https vs. Port 443
    1. https://mysite.foo/
    2. http://mysite.foo:443/
  12. Ftp vs. Http
    1. ftp://mysite.foo/
    2. http://mysite.foo/
  13. Letter casing in parameter name
    1. http://mysite.foo/?param=bar
    2. http://mysite.foo/?Param=bar
  14. Letter casing in parameter value
    1. http://mysite.foo/?param=bar
    2. http://mysite.foo/?param=Bar
  15. Hash vs. no hash
    1. http://mysite.foo
    2. http://mysite.foo#
  16. Hash vs. Fragment
    1. http://mysite.foo#frag
    2. http://mysite.foo#
  17. Fragment vs. no Fragment
    1. http://mysite.foo#frag
    2. http://mysite.foo
  18. Plus vs. Space in path
    1. http://mysite.foo/url+design
    2. http://mysite.foo/url design
  19. Space vs. Encoded Space in path
    1. http://mysite.foo/url design
    2. http://mysite.foo/url%20design
  20. Plus vs. Encoded Plus in path
    1. http://mysite.foo/url+design
    2. http://mysite.foo/url%2Bdesign
  21. Slash vs. Encoded Slash in path
    1. http://mysite.foo/top/second
    2. http://mysite.foo/top%2Fsecond
  22. Ampersand vs. Encoded Ampersand in path
    1. http://mysite.foo/abc&xyz
    2. http://mysite.foo/abc%26xyz
  23. Ampersand vs. Encoded Ampersand in parameter value
    1. http://mysite.foo/?q=abc&xyz
    2. http://mysite.foo/?q=abc%26xyz
  24. Equals vs. Encoded Equals in path
    1. http://mysite.foo/abc=xyz/
    2. http://mysite.foo/abc%3Dxyz/
  25. Equals vs. Encoded Equals in parameter value
    1. http://mysite.foo/?q=abc=xyz
    2. http://mysite.foo/?q=abc%3Dxyz
  26. Parameter order
    1. http://mysite.foo/?abc=123&xyz=987
    2. http://mysite.foo/?xyz=987&abc=123
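
For anyone who wants to test their answers with a computer, here is a minimal Python sketch of some of the normalization steps RFC 3986 describes in sections 6.2.2 and 6.2.3: lowercasing the scheme and host, dropping a default port, treating an empty path as "/", and tidying percent-escapes. The function name `normalize`, the `DEFAULT_PORTS` table, and the decision to drop the fragment are assumptions made for illustration. The sketch skips dot-segment removal, userinfo, and IPv6 hosts, and it says nothing about what a particular server will actually return.

```python
import re
import string
from urllib.parse import urlsplit, urlunsplit

# Default ports for the schemes that appear in the quiz (illustrative subset).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}

# Characters RFC 3986 calls "unreserved"; percent-encoding them is redundant.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def _normalize_escapes(text):
    """Uppercase the hex digits of each %XX escape, and decode the escape
    only when it stands for an unreserved character."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        return "%" + match.group(1).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", fix, text)


def normalize(url):
    """Return a normalized form of url for string comparison (hypothetical helper)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # Keep the port only when it is not the default for the scheme.
    if port is not None and port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{port}"
    # An empty path is treated as "/".
    path = _normalize_escapes(parts.path) or "/"
    # Assumption of this sketch: the fragment is dropped.
    result = urlunsplit((scheme, host, path, "", ""))
    # urlsplit cannot tell "no query" from "empty query", so check for the
    # "?" delimiter directly and keep it when it was present.
    if "?" in url.split("#", 1)[0]:
        result += "?" + _normalize_escapes(parts.query)
    return result


if __name__ == "__main__":
    print(normalize("http://MySite.foo:80/index.htm"))
```

Two URLs that normalize to the same string are equivalent under these rules. Two that do not may still return the same resource on a given server, which is where ‘Maybe‘ comes in.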

P.S. Don’t stress if you can’t answer them all. It took me months to uncover all these nuances, and if I were taking this quiz I doubt I could get them right all in one sitting.

Footnotes

  1. I’m using the non-existent top-level domain “.foo” to avoid giving any link-love to arbitrary example sites that don’t deserve it! For the purpose of the quiz, just assume that “.foo” is a functioning top level domain.
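  2. Question A (the first pair in the list above, “The ‘www’ domain”).
  3. Question B (the second pair in the list above, “Letter casing in path”).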

URLQuiz #1: To .WWW or not to .WWW?

Monday, February 19th, 2007

As promised, this is the first of what will be many URLQuizzes here at the blog for The Well Designed URLs Initiative. This URLQuiz discusses the convention of using a subdomain with the name ‘www‘ to identify a website.

As most everyone knows, many of the first sites on the web started using this convention. Examples include www.amazon.com, www.yahoo.com, www.google.com, and www.ebay.com. However, there is nothing about the web that requires a subdomain be named ‘www‘ when selecting the address for a website. To the contrary, many websites use other subdomains for prefixes such as:

There is even a passionate contingent of web developers who believe the ‘www‘ convention is an anachronism and should be deprecated (or ‘eventually abolished‘, in layman’s terms.)

So how should the base domain and subdomain(s) be handled, and what are the pros and cons of each? Here are the options I’ve identified, but feel free to suggest others that come to mind as well:

  1. Establish the ‘www‘ form as the implicit canonical form and issue a 404 - Not Found whenever an inbound request attempts to dereference a URL using the root domain (i.e. without ‘www‘ or any other subdomain.)
  2. Establish the non-’www‘ form as the implicit canonical form and issue a 404 - Not Found whenever an inbound request attempts to dereference a URL using the ‘www‘ subdomain.
  3. Establish the ‘www‘ form as the implicit canonical form and use a 301 - Moved Permanently (redirect) whenever an inbound request attempts to dereference a URL using the root domain (i.e. without ‘www‘ or any other subdomain.)
  4. Establish the non-’www‘ form as the implicit canonical form and use a 301 - Moved Permanently (redirect) whenever an inbound request attempts to dereference a URL using the ‘www‘ subdomain.
  5. Do not establish a canonical form and return 200 - Ok for both the ‘www‘ form and the non-’www‘ form.
  6. Abandon both the ‘www‘ form and the non-’www‘ form and always use explicit subdomains based on your site organization like in the examples shown above.
  7. Some combination of 1 through 6 I haven’t already described.
  8. Or, something completely different?
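
Options 1 through 5 differ only in what the server sends back when a request arrives on the non-canonical host. As a rough sketch in Python, assuming the request host is always one of the two forms (the function `respond`, its `policy` names, and the plain `http` Location header are all hypothetical, not taken from any particular server or framework):

```python
def respond(host, path, canonical_host, policy="redirect"):
    """Return (status, headers) for a request to host + path.

    policy "reject"     -> options 1 and 2 (404 on the non-canonical host)
    policy "redirect"   -> options 3 and 4 (301 to the canonical host)
    policy "serve_both" -> option 5 (200 on either host)
    """
    # Host names are case-insensitive, so compare them in lowercase.
    if host.lower() == canonical_host.lower():
        return 200, {}
    if policy == "serve_both":
        return 200, {}
    if policy == "redirect":
        # Send the client to the same path on the canonical host.
        return 301, {"Location": f"http://{canonical_host}{path}"}
    # Any other policy is treated as "reject".
    return 404, {}


if __name__ == "__main__":
    # Option 3: 'www' is canonical, the root domain redirects.
    print(respond("mysite.foo", "/path", "www.mysite.foo"))
    # Option 2: the root domain is canonical, 'www' is rejected.
    print(respond("www.mysite.foo", "/path", "mysite.foo", policy="reject"))
```

Passing `www.mysite.foo` or `mysite.foo` as `canonical_host` is what selects between the ‘www‘ and non-’www‘ variants of each option.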

So there you go; give your answer(s) in the comments. Though I definitely have my opinions on the subject I will stay out of it unless I don’t see anyone mentioning several of the points I think are relevant. After enough comments come in, I’ll summarize and write a follow up post, just like Dan Cederholm did with SimpleQuiz.

Hint: You might want to consider not only online usage but offline usage as well.

UPDATE: Just days after writing this post Tim Bromhead wrote: Which is better for your site: www or no www?  Is that weird or what? Tim must have had some kind of a Vulcan Mind Meld or similar going on… Anywho, great article Tim and thanks for being a URLian!

UPDATE#2: Looks like I picked the right time to discuss this issue! A few days ago Scott Hanselman talked about the downside of ignoring the distinction between ‘www’ and the root domain, Jeff Atwood discussed how to solve it, and Phil Haack then responded with a bit of a rant about the www or lack thereof. Since they both have such strong yet opposite opinions on the subject, maybe we can get both Jeff and Phil to weigh in on the subject over here…?

Technorati Tags: URL Design | Subdomains | Canonical Form | www | no-www

Which is Worst: the URL for IE7 Add-ons, Firefox Extensions, or Greasemonkey?

Friday, February 2nd, 2007

I am working on a project that had me writing about browser plug-ins, and I needed to link to the main page for Microsoft’s Internet Explorer Add-ons, for Firefox’s Extensions, and lastly for Greasemonkey for Firefox.

I actually looked up those three in the opposite order from how I have them listed above. Greasemonkey’s URL was pretty good, although it’s a shame it’s not greasemonkey.com/.net/.org; the .com resolves to a 403 forbidden page, the .org resolves to a list of advertising links, and the .net resolves to Grease Monkey International, a franchiser of automotive preventive maintenance centers! Whatever the case, I feel pretty good that this URL is going to have really good persistence. It should be around at least as long as Greasemonkey is relevant, if for no other reason than to return a 301:

http://greasemonkey.mozdev.org/

The second URL, for Firefox extensions, was not so good, but I still think there’s a pretty good chance it will still resolve a year from now:

https://addons.mozilla.org/extensions.php?app=firefox

Then there is Microsoft’s horrific URL for Internet Explorer Add-ons. What were they thinking? I’ll bet this URL doesn’t resolve three months from now, let alone in a year or five:

http://www.windowsmarketplace.com/category.aspx?bcatid=834&tabid=1&WT.mc_id=0107_20

URLs like this one from Microsoft are a crying shame. Sadly, Microsoft is one of the few companies that can get away with this without being negatively affected. On the other hand, most companies haven’t a clue how badly URLs like this can affect them.
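
One way to see part of the problem is to pull the Microsoft URL apart and look at what is left for a human to read. Here is a small Python sketch using the standard library's `urllib.parse`; the helper name `dissect` is hypothetical and exists only for this illustration:

```python
from urllib.parse import urlsplit, parse_qsl

URL = ("http://www.windowsmarketplace.com/category.aspx"
       "?bcatid=834&tabid=1&WT.mc_id=0107_20")


def dissect(url):
    """Split a URL into its host, its path, and a list of (name, value) query pairs."""
    parts = urlsplit(url)
    return parts.hostname, parts.path, parse_qsl(parts.query)


if __name__ == "__main__":
    host, path, params = dissect(URL)
    print("host:", host)
    print("path:", path)
    for name, value in params:
        print("param:", name, "=", value)
```

Nothing in the host, the path, or the three query parameters mentions Internet Explorer or add-ons, so the URL gives a reader no hint of what it points to.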

That said, I’d love to get your input:

  1. Why is Microsoft’s URL so bad? Help me find and explain all the reasons why companies should care not to be so careless when designing their URLs. Why is it bad for users, and why is it bad for Microsoft?
  2. Design the Ideal URLs. Assume you have no constraints at all (no badly designed content management system and no inflexible server technology) and suggest the ideal URL for each of the above three resources. Heck, you can even change domain names if you want to. So what would be the best URLs for each of the three above?

Help Expose the URLs that Suck!

Wednesday, January 17th, 2007

Something that would be fun would be to see a list of the top 10 worst designed URLs on the web, a.k.a. URLs that Suck! You know the type, you cringe when you see ‘em, especially if you frequently use their site.

So I’m calling on you to help in this quest. If you see a really bad URL, either place a comment on this page, or better yet, tag it with the following on del.icio.us:

urls-that-suck

If you want to tag a specific URL, please do that. But if you want to indicate that the entire site’s URLs suck, then tag their home page as well! Tag as many of a site’s different types of URLs that suck as you like, but be fair to the site and only tag one of each type, as we don’t want to overrepresent the suckiness of a site’s URLs.

And what’s more, tell your friends about it. It’d be great to get an army of URLians to keep an eye out for those really horrific URLs. Together we can shine a light on these abominations and give their owners a reason to bow their head in shame! ;-)

As soon as I get a statistically significant sampling[1] of sucky URLs, I’ll announce my pick for the Top 10 “URLs that Suck” Awards; probably at the end of 2007, but maybe sooner. I’ll also honor the top ten contributors, as well as showcase anything or anyone that’s a standout for any reason. And if anyone is so inclined, talk to me about sponsoring these awards; you know you wanna!

If you are onboard for this, please leave a comment letting us know you plan to be on the lookout. And to kickstart this, I will nominate the very first site that has some truly awful URLs that Suck!:

Amazon.com

P.S. And if there are any excellent graphic designers out there who would like to help out with the cause, how about designing some badges that bloggers could post on their website linking to this post with a catchy slogan, something like “Help Stamp Out URLs that Suck!” or “Expose Sucky URLs”, or better; you get the idea!

  1. What is a “statistically significant sampling?” I don’t know yet, but I’ll know it when I see it. ;-)

 

Intro, Part 15: About URLQuiz

Wednesday, January 10th, 2007

As a way to gather a broad spectrum of opinion and engage the community on many different URL-related topics, we’ll be offering a URLQuiz from time to time. Inspired by and patterned after Dan Cederholm’s very well received SimpleQuiz series, we’ll take a simple URL Design question, present a few different alternate approaches, and ask readers to weigh in on which they believe to be the best and why.

Dan proved that his SimpleQuiz was very effective at quickly surfacing a consensus for a best practice when a consensus was possible. Building on Dan’s pioneering efforts, we hope to leverage his technique to address the constantly debated issues in URL Design and hopefully arrive at some consensus on URL-related issues ourselves.

So look for the first URLQuiz, part of an ongoing series, here at the Well Designed URLs Initiative blog in the near future.