Archive for the ‘Education’ Category

Seeing things the way in which one wants them to be (not the way they are)

Saturday, May 19th, 2007

Throughout history there have been people who only saw things as they wanted them to be. People with strongly held beliefs whose values guided their actions, be they counter-productive, detrimental, or worse, just plain wrong; nothing mattered but to believe the world was as they wanted it to be. And it’s not just those from the past that are guilty; nay, it seems that people the world over are now more ideological than in any period in my own lifetime. I’d give many examples, but whatever examples I gave I’d be sure to offend over half of my readers!

It’s probably pretty obvious that I’m talking mostly about war and religion in the above, but it’s also sad to see the same from technologists. Case in point: the creators of DabbleDB and the Seaside Framework. There is an unfortunate school of thought among some web developers, one that Avi Bryant evidently shares[1], that clean URLs are simply not important, that they are just the obsession of overly pedantic developers pursuing unimportant elegance. And those opinions are often rationalized by statements like these (on Mike Pence’s blog) that clearly exhibit confirmation bias:

“I have not had one single person ever mention in Dabble DB that the URL’s look funny. People are used to it. If you go to Amazon.com, the URL’s have all kinds of opaque identifiers in them. It is just not something that the average user cares about. I think it becomes an obsession for developers to have this sense of having a clean API exposed by their web application, but I think you can have a clean API that does not have to include every single page in your app, and I don’t think that every single page in your app has to be bookmarkable. I think that as long as a bookmark gets you back, roughly, to where you wanted to be, or for really crucial things to have permalinks, then you are fine.”

Well I guess Avi can’t say that anymore (that he hasn’t had one single person complain about DabbleDB’s URLs).

That said, why would Avi believe URLs to be unimportant anyway?  There is significant evidence all over the web that URLs are important, not the least of which has been documented on this blog in the past. As best I can tell Avi’s regrettable belief occurs because of his desire to be unburdened from dealing with web architecture so that he can foist highly stateful web apps onto an unknowing and unsuspecting public simply because that’s what Avi values. Basically Avi chooses to ignore the importance of URL design for both users and good web architecture and has his framework emit simply awful URLs because doing so makes coding and using his server-side framework so much easier. That’s similar to someone not addressing the unfortunate necessity of security simply because dealing with security is a PITA. (BTW, Amazon’s URLs are some of the worst and they only get away with it because of their early momentum. They are NOT a good example to emulate.)

So you see DabbleDB exhibits some very clear examples of really bad URLs. To see for myself I created a free trial account over at DabbleDB, which gave me my own well-designed URL (itself, not bad):

http://welldesignedurls.dabbledb.com/

Next I created an application called “Sites” and a first “category” that I named “Domains” (evidently in DabbleDB parlance a “category” is like a table to us relational database types.)  This gave me the following URL:

http://welldesignedurls.dabbledb.com/dabble/sites?view=2&_k=ZEiTkHyn

Not bad, but the “/dabble/” is unnecessary, the “view” could have been defaulted, and the “_k” is, well, so gratuitous I doubt I need to criticize it further.  Clearly what I would have preferred to see is this:

http://welldesignedurls.dabbledb.com/sites/

Or at least:

http://welldesignedurls.dabbledb.com/apps/sites/

And I believe anyone would be hard pressed to explain why the actual URL DabbleDB uses is better or why the URL I proposed would not be workable. Still, all is not so bad to this point because it appears DabbleDB will respond appropriately to:

http://welldesignedurls.dabbledb.com/dabble/sites

Of course anyone bookmarking the URL vs. composing the URL for a blog will be linking to two different URLs as per web architecture, which has its own perils for the owner of the website. (I’m of course assuming public URLs for this use-case, which is possible via DabbleDB Commons, itself having a great URL of http://dabbledb.com/commons, but many usability problems still exist in closed environments, which is how most DabbleDB databases will be used.)
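To make the "two different URLs" problem concrete, here is a minimal sketch of how a link collector might fold the bookmarked and hand-composed forms into one. It assumes, purely for illustration, that “_k” is a session key and that “view” only carries a default; I have no inside knowledge of what DabbleDB actually does with either parameter, and the names IGNORABLE_PARAMS and canonicalize are my own.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters carry session or default-view state, not identity.
IGNORABLE_PARAMS = {"_k", "view"}


def canonicalize(url: str) -> str:
    """Return the URL without ignorable query parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORABLE_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


bookmarked = "http://welldesignedurls.dabbledb.com/dabble/sites?view=2&_k=ZEiTkHyn"
composed = "http://welldesignedurls.dabbledb.com/dabble/sites"
same_page = canonicalize(bookmarked) == canonicalize(composed)
```

Under those assumptions same_page is True, but every third party that counts links would have to know the same rules, which is exactly the burden a clean URL avoids.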

But matters get much worse when we drill down into the “Domains” category I created. Compare the following two URLs and guess which one I envisioned vs. the one Dabble generated:

http://welldesignedurls.dabbledb.com/dabble/sites?view=2&_k=qPDotnwm

http://welldesignedurls.dabbledb.com/sites/domains/

And if we click on the name of the domain “welldesignedurls.org“, it gets even worse:

http://welldesignedurls.dabbledb.com/dabble/sites?entry=7&view=2&_k=jGMmkZyZ#objectEditor

Again, I would have liked to have seen:

http://welldesignedurls.dabbledb.com/sites/domains/welldesignedurls.org/
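As a rough sketch of why the URLs I proposed would be workable, here is how a server could read the account, application, category, and entry straight from such a URL. The function name parse_clean_url and the three-level layout are my own assumptions for illustration, not anything DabbleDB or Seaside provides.

```python
from urllib.parse import unquote, urlsplit


def parse_clean_url(url: str) -> dict:
    """Map account.dabbledb.com/application/category/entry/ onto named parts."""
    parts = urlsplit(url)
    # Assumption: the first host label is the account name.
    account = (parts.hostname or "").split(".")[0]
    segments = [unquote(segment) for segment in parts.path.split("/") if segment]
    result = {"account": account}
    # zip() stops at the shorter sequence, so shallower URLs simply omit keys.
    result.update(zip(("application", "category", "entry"), segments))
    return result


info = parse_clean_url(
    "http://welldesignedurls.dabbledb.com/sites/domains/welldesignedurls.org/"
)
```

Here info holds the account “welldesignedurls”, the application “sites”, the category “domains”, and the entry “welldesignedurls.org”, all without a session key in sight.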

Why is this important?  Because people rely on understanding what a URL means in a significant number of contexts, often contexts where only recognition (vs. URL construction) matters: in email, in the browser history list, in older bookmarks, in printed communication, and more.  By analogy imagine how much harder computers would be to use if users had no choice but to always navigate the tree structure of a deeply nested directory instead of simply copying and pasting the path from, for example, Windows’ Explorer or the Mac’s Finder to an Open File dialog[2]. (Just imagine what it would be like if a path to the user’s directory was named “C:\%GSkstyrWshs\@9KBHasklp\”. Ye-Gads!)

There are still further cases where clean URL design is important. For bloggers composing their links, having the ability to learn a link structure (such as on Wikipedia) rather than having to navigate to each page they want to link to is invaluable. For marketers wanting to convey a location in advertising for their customers and prospects to visit. And especially for users of web apps that are heavily data-oriented, where users are involved in editing, navigating to, and communicating various application states (a.k.a. web pages) to their colleagues, such as an app like DabbleDB. If ever there was a category of web apps where good clean URL design is critical, it would be online databases!

So NO Avi, URL Design IS important. I hope you can learn this and make changes to DabbleDB and Seaside before it’s too late for you, and worse, for your users.

Footnotes

  • 1.) How ironic Avi would name his blog “HREF Considered Harmful” as HREFs are truly one of the core foundations of web architecture.
  • 2.) Yes I know that some people don’t ever copy and paste paths but many of the more intelligent and/or aware users do.

Why URL design matters in email

Friday, March 30th, 2007

I’ve long believed email provides one of the better justifications for good URL design. Having a well designed URL structure inspires a user to have faith in a site’s URL integrity, making it more likely they will email a URL to their friends. What’s more, a good URL gives hints to what can be found, making it more likely for an email recipient to visit the link. And a readable URL provides something to “google” when the emailed URL is mangled or simply mistyped by the sender.

But it simply hadn’t occurred to me just how important URL design can be for marketing emails until today, when I read a post by Mark Brownlow of Email Marketing Reports. Mark’s post, entitled Forget email design, what about URL design?, discusses the immediately obvious benefits of URL design in email marketing and lists several reasons why email marketers should pay particular attention to their URLs.

As Mark effectively states, well designed URLs can (elaborations mine):

  • Reinforce a brand message (when a good domain and/or logical URL path is used),
  • Help orientate the reader (within the website’s structure, and/or regarding the offer),
  • Provide text clues to the destination page’s content and value,
  • Indicate important content relationships (via the URL path’s hierarchy and/or between multiple emailed URLs), and
  • Remain relevant and recognisable over a long period of time (assuming the email marketer has a process in place to manage their site’s URL architecture.)

In addition Mark gives a few examples that clearly make the case for good URL design in email marketing. He effectively asks which of these two URLs sends a stronger message to the prospect:

  1. http://www.brandk.com/land.php?123456
  2. http://www.brandk.com/rings/coupon/

I think the preferable one is obvious, don’t you?

Mark also suggests providing your prospect with their own custom call-to-action URL in marketing emails such as:

http://www.brandk.com/rings/coupon/justformark/

I too believe that providing customers with their own personal well designed URL can be an incredibly powerful marketing and SEO strategy. However, I’m not so sure it will work well for unknown prospects.

Well done Mark. Nice to have another URLian on the bandwagon.  :-)

URL Quote #2: Think about your website’s “public face.”

Saturday, March 3rd, 2007
“…one should take an hour or so and really think about their website’s ‘public face.’”

-Scott Hanselman on “A Website’s Public Face”

PageRank

Monday, February 26th, 2007

SEO: Illuminating the value of URL design

If you’ve read many of our other posts here at The Well Designed URLs Initiative, you know that we are strong advocates for User-Centered URL Design as well as for URL Literacy. It’s our contention that the URL is woefully under-appreciated as the most fundamentally important technology of the web, more important than HTTP, and even more important than HTML. The purpose of this post is to provide background for future posts explaining URL design importance from a perspective most website owners can appreciate; search engine rankings!

But they’d have to kill you

For those unfamiliar with Google’s core algorithm for determining its search engine results, you can read this article to learn about PageRank in more painstaking detail. Here, I’ll just try to explain the aspects of PageRank that relate to URLs. Note also that my explanations are simply meant to be a conceptual guide and not exacting details. The founders of Google did publish their initial algorithms but have since made tweaks that are as closely held a corporate secret as the formula for Classic Coke!

Popularity is the key

PageRank is essentially a popularity rating, and a page’s PageRank is determined by the inbound links from other pages on the web. A PageRank can be as low as almost zero (0) or as high as ten (10). Google’s algorithms determine a page’s PageRank by dividing the PageRank value of each inbound linking page by the number of outbound links on that page, and then summing the results for all inbound links. Clear as mud, right? It’s easier to explain with an example, but let’s cover a bit more ground first.

Like voting company shares

PageRank considers each link a ‘vote’ for the page linked to. But unlike in a democratic “one citizen, one vote” society, Google’s algorithm more closely models the shareholders of a corporation voting their shares; the votes of those with “more” (PageRank or shares) have a greater influence on the outcome. So a link from a page with a PageRank of 7 is more valuable than a link from a page of PageRank 3; probably many orders of magnitude more valuable, as you’ll see next.

The old 80/20 rule, on steroids

Because of the nature of the web, a small number of pages have a huge number of inbound links, and vice versa. So those with more links get more PageRank, but PageRank is reported on a logarithmic scale, so the underlying value increases exponentially with each step. Assuming[1] that the base were five (5), the value a page would get to vote based on its PageRank would look like this:

PageRank Value
0 0
1 5
2 25
3 125
4 625
5 3,125
6 15,625
7 78,125
8 390,625
9 1,953,125
10 9,765,625

An example:

Assume a site somehow manages to get a persistent link from MySpace’s home page (www.myspace.com). At the moment MySpace’s home page contains about 70 outbound links and has a PageRank of seven (7). Let’s also assume that the site has a total of 50 other inbound links, and let’s say the average PageRank for those pages linking in is three (3) and those pages have an average outbound link count of 10. From this, let’s calculate PageRank:

MySpace’s Available PageRank per outbound link:
78,125 / 70 => 1,116
PageRank value contributed by 50 other sites:
125 * 50 / 10 => 625
Total PageRank value:
1,116 + 625 => 1,741

Looking it up in the table, the resultant PageRank for the home page is four (4).
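For those who prefer to see the arithmetic spelled out, here is a minimal sketch of the same calculation. It follows my simplified conceptual model with an assumed base of five, not Google’s actual formula, and the function names vote_value and resultant_pagerank are my own.

```python
def vote_value(pagerank: int, base: int = 5) -> int:
    """Value from the table above: 0 for PageRank 0, otherwise base ** pagerank."""
    return 0 if pagerank == 0 else base ** pagerank


def resultant_pagerank(total: float, base: int = 5) -> int:
    """Highest PageRank (capped at 10) whose table value does not exceed total."""
    pagerank = 0
    while pagerank < 10 and base ** (pagerank + 1) <= total:
        pagerank += 1
    return pagerank


# One PR7 link from a page with 70 outbound links.
myspace = vote_value(7) / 70
# 50 PR3 links from pages that average 10 outbound links each.
others = 50 * vote_value(3) / 10
total = myspace + others
home_page_pagerank = resultant_pagerank(total)
```

The lookup is done with whole-number powers rather than a floating-point logarithm so that a total landing exactly on a table value is not rounded down by accident.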

The Three ‘P’s of Inbound Links

As with the three ‘L’s of real estate, the three ‘P’s of inbound links are: PageRank, PageRank, PageRank! [2] Note how in the prior example the 50 inbound links of PR3 offered less PageRank than the one (1) inbound link from MySpace with PR7! Of course we don’t know the logarithmic base [1] but Phil Craven says 5 or 6 are what many people believe it to be.

Here is what it would look like with base two (2) through ten (10) (download the full calculations here as a zipped Excel 2003 file [4kb]):

Base  Value from PR3 * 50 / 10  Value from PR7 * 1 / 70  Total Value  Resultant PageRank
2     40                        2                        42           5
3     135                       31                       166          4
4     320                       234                      554          4
5     625                       1,116                    1,741        4
6     1,080                     3,999                    5,079        4
7     1,715                     11,765                   13,480       4
8     2,560                     29,959                   32,519       4
9     3,645                     68,328                   71,973       5
10    5,000                     142,857                  147,857      5

So depending on the logarithmic base, PageRank fluctuates between four (4) and five (5) for this hypothetical example. However, starting with a logarithmic base of five (5) the one MySpace link overpowers the 50 others! And because pages with a PageRank closer to 10 are listed higher in Google’s search engine results page among competing pages, people focused on SEO are always trying to increase their page’s PageRank, often via unscrupulous means.

Of course nobody outside Google knows the exact formula or base exponents used, but hopefully this post illustrates the value of links from high PageRank pages.
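If you would rather not open a spreadsheet, the sweep over bases two through ten can be sketched in a few lines. As before this is only my conceptual model, the rounding to whole numbers is my own choice, and the names resultant_pagerank and rows are hypothetical.

```python
def resultant_pagerank(total: int, base: int) -> int:
    """Highest PageRank (capped at 10) whose value base ** PR does not exceed total."""
    pagerank = 0
    while pagerank < 10 and base ** (pagerank + 1) <= total:
        pagerank += 1
    return pagerank


rows = []
for base in range(2, 11):
    others = round(base ** 3 * 50 / 10)   # 50 PR3 links, 10 outbound links each
    myspace = round(base ** 7 / 70)       # one PR7 link, 70 outbound links
    total = others + myspace
    rows.append((base, others, myspace, total, resultant_pagerank(total, base)))

# First base at which the single MySpace link outweighs the 50 other links.
tipping_base = next(base for base, others, myspace, _, _ in rows if myspace > others)

for row in rows:
    print(*row)
```

Each printed row is base, value from the 50 PR3 links, value from the MySpace link, total value, and resultant PageRank.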

Don’t game the system

However, I would be remiss if I didn’t point out that a single-minded focus on inbound links is fraught with peril, not the least because it might cause your pages to be removed from Google’s index! Just as there are people selling weight loss products they claim don’t require dieting or exercise, there are people offering ways to get inbound links that don’t require having real people link to you. However, Google considers these shortcuts to be gaming the system and is ever vigilant to discover those cheaters. If you are caught cheating, Google will ban your pages from their index without notice.

The best way to gain inbound links for your key pages on your website is to do the hard work of creating a site with great content that people want to link to.

Architecture Matters

So as an epilogue, getting inbound links is clearly necessary for high PageRank and thus good search engine results, but all those inbound links can be squandered without a good architecture and site management plan. To ensure that a site’s great content and popularity get reflected in appropriately high search engine ranking it’s critical to optimize the architecture of the site, the pages, and the URL structure as well as make plans for how the URL structure might change over time.

The most under-realized aspect of SEO

Personally, I think the most under-realized aspect of white hat SEO [3] is the lack of attention paid to URL planning and design, especially for larger websites. There are very few tools [4] besides the low-level and effectively simplistic URL rewriters like mod_rewrite for creating and maintaining a URL plan, very few articles [4] that discuss URL design, and no articles [4] I am aware of that discuss URL planning.

However, I believe website owners will see huge improvements compared to their prior rankings if they focus on URL design and create a URL management plan. The good news is that URL design is mostly a one time endeavor assuming site maintainers adhere to the management plan, at least until there is a full site rearchitecture.

But all of the whys and wherefores regarding URL planning and design are beyond the scope of this post, and instead will be the subject of many posts in the future. Stay subscribed!

For Further Research

And, as I stated at the start of this post, you can learn more about the PageRank formula here, and you can also google for PageRank to get a large list of other resources.

Footnotes

  1. But remember it’s a secret, so we can’t know for sure.
  2. There’s more to search engine ranking than just PageRank, like applicable content, but PageRank differentiates pages that compete for the same keywords.
  3. To those SEO-haters of the world, please note that I’m referring to those things that you can do with pure white hat techniques, things that if not done can result in a great site being given less credit by the search engines.
  4. Over time, I plan to address the lack of such articles and tools for URL planning and design.


Best Practice: Always ID your Heading Tags

Thursday, January 18th, 2007

Here’s a simple best practice. Always ID your heading tags! For example, if you’ve got an <h2> element, be sure to make it <h2 id="some-heading">.

IDing heading tags is especially important on long documents.

Why? Because if you don’t, someone else can’t reference the part of the document that they want to reference in a blog post or somewhere else. And if they can’t, they just might reference someone else’s web page instead. Or if they do reference it, readers who click over to your URL might give up on reading before finding the appropriate part of the document, and never come back to your site when they might otherwise have become avid readers. How often have you seen a link to a web page where the person linking included the text “Scroll down to the section entitled…“  Blech!

Given the heading tag mentioned in the first paragraph above, and assuming it was contained in a document entitled “whitepaper” in the root of www.foo.com, you can point straight to that heading using a URL fragment like so:

http://www.foo.com/whitepaper#some-heading

Ben Coffey talks about this same problem over at URLs for Specific Portions of Documents.  He also talks about CiteBite, which helps bloggers and others link directly into a part of a document as if there had been an ID there. But publishers, if others start using CiteBite on your content simply because you don’t include the ID attributes they need to link to you directly, guess who will get the Google PageRank?  Not you… ;-)

One more thing. If you are creating content that will be displayed above or below other content, i.e. blog posts that get listed with other blog posts on the same HTML page, you’ll need to make sure your IDs are unique. I personally have started using a convention that appends the date in “YYYYMMDD” format to the end of a meaningful fragment, separated by a dash, as in:

http://www.foo.com/whitepaper#some-heading-20070118

This tends to work for me because I almost never post more than once per day. Also, though I personally dislike the inclusion of dates in URLs because of how difficult it makes things for users to remember or discover the URLs, having the date as a fragment suffix is not quite as bad. People using the browser URL auto-complete can still easily find the URL they visited recently enough that its URL is still in the browser’s cache. YMMV.
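If your publishing tool lets you script your headings, the convention is easy to automate. Here is a minimal sketch; the function names heading_id and heading_tag and the exact slug rules are my own assumptions, not part of any blogging package.

```python
import html
import re
from datetime import date


def heading_id(text: str, posted: date) -> str:
    """Build an ID such as some-heading-20070118 from heading text and post date."""
    # Lowercase, then collapse every run of other characters into a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return f"{slug}-{posted:%Y%m%d}"


def heading_tag(text: str, posted: date, level: int = 2) -> str:
    """Render the heading element with its ID attribute filled in."""
    return f'<h{level} id="{heading_id(text, posted)}">{html.escape(text)}</h{level}>'


tag = heading_tag("Some Heading", date(2007, 1, 18))
```

Two headings with the same text on the same day would still collide, so YMMV here as well.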

Lastly, if you are going to ID your heading tags, you probably should also create a table of contents. ;-)

Lessons Learned from Delicious Praise

Tuesday, January 16th, 2007

I learned an interesting lesson about URLs and social software; change your article’s URL or publish it under multiple cases, even with 301 redirects, and you’ll end up with fragmented references.

Back in August 2005 I published the article Well Designed URLs are Beautiful on my personal blog and have since had well over 400 people tag it on del.icio.us. Unfortunately during that same time, I’d changed my blog configuration twice to improve its URLs, with one more change planned. Of course for each change I made sure the old URLs were redirected with a 301. Unfortunately, these changes resulted in my article being tagged from three different URLs, hence splitting up its ranking on del.icio.us three ways!

So, when changing URLs, be careful. You could harm yourself in the process.

P.S. As an aside, this speaks sadly to one’s ability to reorganize a poorly organized website, notwithstanding the fact that Cool URIs (aren’t supposed to) change. I can understand why del.icio.us and others don’t automatically test their links for 301s on an ongoing basis because of the resources required, but delicious and others could offer a “check me” test for links. This would even give them a legitimate reason to periodically email their users and say “Hey, here are all the links that moved!” That email could even ask “Should we keep the changes?” to guard against spammers, though I don’t think this would be a big issue. It is hard enough to get delicious links, so how would spammers get them to begin with?
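To show what such a “check me” pass might do once the 301s are known, here is a minimal sketch that follows a table of redirects and merges the tag counts under the final URL. The paths and counts are made up for illustration (they are not my real del.icio.us numbers), the functions resolve and merge_counts are hypothetical, and a real service would discover the redirects by requesting each URL rather than reading them from a dictionary.

```python
def resolve(url: str, redirects: dict, max_hops: int = 10) -> str:
    """Follow old -> new mappings until a URL with no further redirect is reached."""
    seen = set()
    # Stop on a loop or after max_hops so a bad redirect table cannot hang us.
    while url in redirects and url not in seen and len(seen) < max_hops:
        seen.add(url)
        url = redirects[url]
    return url


def merge_counts(counts: dict, redirects: dict) -> dict:
    """Sum per-URL tag counts under the URL each one finally redirects to."""
    merged = {}
    for url, count in counts.items():
        final = resolve(url, redirects)
        merged[final] = merged.get(final, 0) + count
    return merged


# Hypothetical data: two URL changes, each redirected with a 301.
redirects = {"/blog/archive/42": "/blog/2005/08/urls", "/blog/2005/08/urls": "/blog/urls/"}
counts = {"/blog/archive/42": 150, "/blog/2005/08/urls": 120, "/blog/urls/": 140}
merged = merge_counts(counts, redirects)
```

With this sample data all three entries collapse into a single one for “/blog/urls/” with a count of 410.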

Anyway, I hope to see things evolve in the future where changing links won’t be such a big issue. And yes, sometime in the reasonably near future I plan to be an advocate for some specific plans to enable this.

Ben Coffey and his eBook “Useful URLs”

Thursday, December 28th, 2006

In the hustle and bustle of these holidays, I didn’t manage to churn out all of my planned posts for our introduction series. So with nothing else to publish today it seems only fitting I introduce you to the works of someone I met recently. His name is Ben Coffey and he’s an absolutely brilliant kid[1] from the U.K. who shares my passion for User-Centric URL Design.

The work I spoke of is his ebook entitled “Useful URLs”. It is available for download at his website inelegant.org in both HTML and PDF formats. Further kudos to Ben, for he licensed Useful URLs using a Creative Commons “by-nc-sa” license, meaning you’re free to copy and distribute it as long as you attribute him as author, you don’t use it for commercial purposes, and you share any of your own modifications or additions under the same license.  Ben says that Useful URLs is a work-in-progress so he plans on updating it continuously. I’ll be sure to make mention here when he updates it, especially if he adds new chapters.

That said, Useful URLs is both a quick read and also:

Highly Recommended!

  • 1.) Sorry for calling Ben a kid, but at 23 he’s barely more than half my age, and I still feel relatively young! But the more I talk to him, the more I think I barely know half what he does, at least with respect to programming and the Internet! I’m actually afraid to talk to him about anything else for fear I’ll learn that I barely know half as much on those topics as well!