I'm going to say this as simply as I can: web pages shouldn't have file extensions. Ever!*

The Internet is full of best practices: how to structure your web pages, how to handle redirection properly, how to design friendly URLs. For example, we know that the following URL...
http://www.somesite.com/products/videogames.htm
...is better (in every way) than this one:
http://www.somesite.com/products.htm?category=234398304&sessionid=029384029348
But both URLs suffer from another problem: they contain a file extension. Including a file extension in your public URLs isn't a best practice, it's a worst practice - or at least, a worse practice. And it's an epidemic one. Today, the vast majority of websites serve .htm files, .aspx files, .jsp files - whatever the flavor of the month happens to be. And it works because HTTP is largely an agnostic protocol: it doesn't much care about things like file extensions, or even files. In fact, the following is a perfectly valid URL to request over HTTP:
http://www.somesite.com/marypoppins.supercalafrajalisticexpialadocious
Does it point to a file of type "supercalafrajalisticexpialadocious"? Or some other kind of resource, perhaps one that's generated from a database? HTTP doesn't know and it doesn't care. It simply requests a particular resource, and receives a response.
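To make that concrete, here is a minimal Python sketch of how a server might answer requests without ever consulting a file extension. The routing table, the respond function, and the page contents are all hypothetical, invented for illustration. The point is only that the server decides what a path means, and the Content-Type header in the response, not the URL, tells the browser what it received.

```python
# Hypothetical routing table: URL path -> (media type, body).
# The paths and bodies are made up for this illustration.
ROUTES = {
    "/products/videogames": (
        "text/html; charset=utf-8",
        "<h1>Video games</h1>",
    ),
    "/marypoppins.supercalafrajalisticexpialadocious": (
        "text/plain; charset=utf-8",
        "A spoonful of sugar",
    ),
}


def respond(path):
    """Return (status, headers, body) for a requested path."""
    # Drop any query string; only the path selects the resource.
    path = path.split("?", 1)[0]
    if path not in ROUTES:
        return 404, {"Content-Type": "text/plain; charset=utf-8"}, "Not found"
    media_type, body = ROUTES[path]
    # The media type comes from the server's own bookkeeping,
    # never from whatever follows the last dot in the URL.
    return 200, {"Content-Type": media_type}, body
```

In this sketch the extension-less path is served as HTML and the path with the absurd "extension" is served as plain text, simply because that is what the table says.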
So, if HTTP doesn't care about file extensions, and browsers don't (usually) care about file extensions, why should we? What's so bad about having an innocuous little ".htm" or ".aspx" in our URLs?
For starters, let's look at what Tim Berners-Lee said on the subject in a classic article I've quoted before:
What to Leave Out [of your URLs]
Everything! After the creation date, putting any information in the name is asking for trouble one way or another.
- .........
- File name extension. This is a very common one. "cgi", even ".html" is something which will change. You may not be using HTML for that page in 20 years time, but you might want today's links to it to still be valid. The canonical way of making links to the W3C site doesn't use the extension.(how?)
- Software mechanisms. Look for "cgi", "exec" and other give-away "look what software we are using" bits in URIs. Anyone want to commit to using perl cgi scripts all their lives? Nope? Cut out the .pl. Read the server manual on how to do it.
- .........
And it's a valid point - who knows how long any of these technologies, with their specific and often proprietary file extensions, will be around? Even the sacrosanct .HTM extension can go extinct - for example, as more and more people switch to ASP.NET, JSP, PHP, or other dynamic content technologies. And if .HTM files can go extinct, you'd better believe that ASPXs, JSPs, and PHPs can too.
Nor is URL longevity the only thing at stake. Perhaps you don't think your pages will be around in twenty years' time, or a hundred. Maybe you don't care. But consider some of the other potential downsides:
File extensions are ugly. There's no particular reason to confront your users with them. Sure, everybody knows what an .HTM file is. But how about .ASHX or .JSP? Do we expect every user to somehow think, "Aha! JSP, that's a JavaServer Page! They're serving dynamic content using JSP! Yeee haw!"?
File extensions are irrelevant. A file extension contributes zero useful information. It doesn't help identify a particular page, or distinguish it topically from other pages. It doesn't extend your site's identity or branding in any way. It's an irrelevant implementation detail hacked onto the end of what could otherwise be a truly clean URL.
File extensions give away implementation details. In the lingo of object-oriented analysis and design, we'd say that file extensions violate encapsulation. They tell the world, "this is the technology I'm using, underneath the hood."
File extensions make life difficult if you ever decide to switch technologies. Let's say you've assembled a content-rich website with hundreds or thousands of pages, all tagged with a .jsp extension. You spend countless hours marketing your site, getting people to link to you, establishing a position in the search engines. Until one day, you decide it's time to make the leap to ASP.NET (or any other technology). Now you're faced with two ugly choices:
- Change the extension of all URLs from ".jsp" to ".aspx", invalidating all your incoming links, wreaking havoc with your search engine ranking, and probably forcing you to 301-redirect every page on the site to the "new" version.
- Somehow get ASP.NET (or whatever technology you're working with) to work with .jsp files. This is possible, if you know what you're doing, but is it clean? Can it be considered anything other than a kludge to make your ASP.NET-generated content masquerade as a .JSP?
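If you do end up redirecting, one way to pay that price only once is to redirect to extension-less URLs rather than to the new technology's extension. The Python sketch below shows what the path mapping might look like. The redirect_target function and the LEGACY_EXTENSIONS set are hypothetical names chosen for illustration, and the sketch assumes it is handed only the path portion of the URL, with the query string handled elsewhere.

```python
import posixpath

# Extensions treated as legacy in this illustration (an assumption;
# adjust the set to whatever your site actually used).
LEGACY_EXTENSIONS = {".htm", ".html", ".jsp", ".aspx", ".php"}


def redirect_target(url_path):
    """Return the extension-less path to 301-redirect to, or None.

    None means the path has no legacy extension and needs no redirect.
    """
    # splitext only looks at the final path segment, so a dot in an
    # earlier segment (e.g. /archive.v2/page) is left alone.
    root, ext = posixpath.splitext(url_path)
    if ext.lower() in LEGACY_EXTENSIONS:
        return root
    return None
```

A server or front-end proxy could call something like this on each incoming path and issue a 301 whenever it returns a value. After that, the next change of technology would not need to touch the public URLs at all.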
File extensions, in other words, are EVIL. By avoiding them, we not only renew our allegiance to all things good, we separate ourselves from the millions of sites that have hardcoded their allegiance to a particular technology. There's a reason why Wikipedia and the W3C, among others, use mostly extension-less URLs. We could do worse than to follow their example.
Thanks for reading, and remember, the Devil's in the details.
* - Okay, well not necessarily never. Usually, though. All other things being equal.
Posted by James Devlin