The dash is probably the single most abused piece of punctuation in the English language. No, I mean it.

Dash abuse. It's a serious crime, with serious consequences. I'm guilty of it; you might be guilty of it. We spend years—lifetimes—figuring out when to use commas, colons, and semicolons. We learn to do nonsensical things like putting the comma inside the quotation marks because, well, that's the way it's done.

We spend a lot of time and energy getting grammar and punctuation right. But when we need a dash (whether of the em or en variety), a hyphen, or even a minus sign, eh. We just mash the one-size-fits-all "dash/hyphen/minus/whatever" button next to the zero key.

We might even get crazy and press it two or three times --- like this --- and we might squirt in a couple of spaces on either side, just to make it clear that we mean business. It's what you could call a descriptivist, anything-goes approach to punctuation, and it means there are about a dozen ways to represent the following Einstein quote:
- I never think of the future-it comes soon enough.
- I never think of the future--it comes soon enough.
- I never think of the future---it comes soon enough.
- I never think of the future–it comes soon enough.
- I never think of the future––it comes soon enough.
- I never think of the future—it comes soon enough.
Those are all "closed" dashes, meaning they aren't surrounded by spaces. You can also use open dashes, like this:
- I never think of the future - it comes soon enough.
- I never think of the future -- it comes soon enough.
- I never think of the future --- it comes soon enough.
- I never think of the future – it comes soon enough.
- I never think of the future –– it comes soon enough.
- I never think of the future — it comes soon enough.
And for that matter, you don't have to use a full space on either side of the dash; you can use what's called a hair space.
- I never think of the future - it comes soon enough.
- I never think of the future -- it comes soon enough.
- I never think of the future --- it comes soon enough.
- I never think of the future – it comes soon enough.
- I never think of the future –– it comes soon enough.
- I never think of the future — it comes soon enough.
The difference may or may not be readily apparent, but it's there. That's a total of eighteen separate ways to render a freaking dash, and that's not counting the minus sign, which is (of course) a separate glyph. Some of these dashing styles are correct, most are incorrect, some are still being debated...and we bloggers and e-authors use them all.
It's positively dashtardly.
So listen up, fellow dash-abusers. All dashes were not created equal. Even though the typical keyboard has a single dash/hyphen key, there are at least five different species of dash:
That's not counting other symbols which look like dashes:
Each of these has a precise meaning and a separate dedicated glyph.
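To see those dedicated glyphs for yourself, here is a minimal Python sketch that prints the code point and official Unicode name of a few dash-like characters. The selection in `DASH_LIKE` is an illustrative assumption, not a definitive roster of the species mentioned above, and the helper names are made up for this example.

```python
import unicodedata

# Illustrative selection of dash-like characters (an assumption, not an
# exhaustive or authoritative list).
DASH_LIKE = ["\u002D", "\u2010", "\u2012", "\u2013", "\u2014", "\u2015", "\u2212"]


def describe(ch):
    """Return (code point label, official Unicode name) for one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


def dash_table(chars=DASH_LIKE):
    """Describe every character in the given list."""
    return [describe(ch) for ch in chars]


if __name__ == "__main__":
    for code, name in dash_table():
        print(code, name)
```

The key on your keyboard produces only the first entry, U+002D HYPHEN-MINUS; every other character in the list has to be entered some other way.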
Now in the old days, the single-byte ASCII character set only provided for exactly one type of dash: the hyphen or "hyphen/minus", decimal value 45. If you needed a dash, a hyphen, or a minus sign, you just used ASCII 45. And sometimes, in order to make your dashes stand out more, you'd line two of these puppies up together in a sort of double-hyphen dash:
I never think of the future -- it comes soon enough.
This style was acceptable when we had no other options. It was the best practice for representing longer dashes when the character set had no real support for them. It even found its way into a few manuals of style. The Wikipedia article on the dash explains why:
Typewriters and computers have traditionally had only a limited character set, often having no key with which to produce a dash. In consequence, it became common to substitute the nearest incorrect punctuation mark or symbol. Em dashes are often represented by a pair of spaces surrounding a single hyphen-minus (typical British usage) or by a pair of spaces surrounding two hyphen-minuses (mostly in the United States).
Now fast forward 20 years.
Any kind of dash can manifest directly in an HTML document, but HTML also allows them to be entered as character entity references. The entity names for the em dash and the en dash are mdash and ndash; therefore, they can be referenced in HTML as &mdash; and &ndash;. The equivalent numeric character references are &#8212; and &#8211;. Nearly all web browsers and operating systems used today are capable of rendering the numeric form, and almost as many correctly display the named form.
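The named and numeric references are easy to check against each other. The sketch below uses Python's standard `html` module; the helper name `entity_forms` is hypothetical and exists only for this illustration.

```python
import html
from html.entities import name2codepoint


def entity_forms(name):
    """Return the named reference, numeric reference, and literal character."""
    codepoint = name2codepoint[name]
    return f"&{name};", f"&#{codepoint};", chr(codepoint)


if __name__ == "__main__":
    for name in ("mdash", "ndash"):
        named, numeric, char = entity_forms(name)
        # Both reference forms should decode to the same literal character.
        assert html.unescape(named) == html.unescape(numeric) == char
        print(named, numeric, f"U+{ord(char):04X}")
```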
Hello Unicode.
Hello HTML.
Hello capable web browsers.
Hello semantic web.
On the modern web, or in any writing of any sort other than a README.TXT file, the once-useful double-hyphen dash is inappropriate if not downright evil.* Modern word processors will actually abort the double-hyphen dash at the moment of conception, replacing it on the fly with a proper em dash:

What the word processor is trying to tell you is that there's no longer any reason to manhandle hyphens as dashes. Any modern word processor or browser you care to mention is going to have support for (at the very least) the em dash and the en dash, which is all the dash most people ever need. Within HTML, you can input true dashes directly even if your editor doesn't support them outright. And yet, for some reason, the hyphen-as-dash heresy still enjoys a wide following. Sites like CNBC, Huffington Post, and Coding Horror all do the double-hyphen forbidden dash lambada and they are by no means alone.
"We think we can have a successful U.S. auto industry. But it's got to be one that's realistically designed to weather this storm and to emerge -- at the other end -- much more lean, mean and competitive than it currently is," Obama said.
“I'm delighted that today we are launching a new venture -- The Huffington Post Investigative Fund. This nonprofit Fund will produce a wide-range of investigative journalism created by both staff reporters and freelance writers.”
“I realize that using financial incentives on open source projects can have some unintended consequences. But a sort of attention and interest aggregation service for existing projects -- one backed by real money, so you know the interested parties are serious -- seems like a worthwhile cause. It might even attract the interest of other programmers if the pool got large enough.”
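Double-hyphen dashes like the ones in these quotes can be cleaned up mechanically, much as a word processor does on the fly. What follows is only a rough sketch of that kind of substitution, not how any particular word processor actually works: the function name and the regular expression are assumptions, and producing a closed em dash (no surrounding spaces) is a style choice rather than a rule.

```python
import re

EM_DASH = "\u2014"

# Two or three hyphen-minuses, with an optional space on either side, that sit
# between non-space characters. Longer runs (e.g. "-----" rules) are left alone.
_DOUBLE_HYPHEN = re.compile(r"(?<=[^\s-]) ?-{2,3} ?(?=[^\s-])")


def fix_double_hyphens(text):
    """Replace typewriter-style '--' or '---' with a closed em dash."""
    return _DOUBLE_HYPHEN.sub(EM_DASH, text)


if __name__ == "__main__":
    print(fix_double_hyphens("I never think of the future -- it comes soon enough."))
```

Ordinary single hyphens, as in "dash-abusers", pass through untouched.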
On the other hand, Google and Ars Technica (just to name a couple) get it right:
Making changes of this kind is never easy—and we recognize that the recession makes the timing even more difficult for the Googlers concerned.
Still, even a 3x speed increase will be welcome to existing iPhone owners updating to the new OS—the speed will be beneficial for both web-based applications and general surfing.
Now all this is, admittedly, a minor point. I don't want to be a dash-hole about it. After all, if it looks like a dash and quacks like a dash, it's a dash. Visually, dashes, hyphens, and minus signs are similar enough that the intended meaning is rarely lost. And we're not supposed to be using too many dashes anyway. And if the writing is decent and informative, who cares?
(By the way, don't even get me started on the subject of dashes/hyphens/minuses in numbers or ranges:
- On April 20–24, I'll be in Madrid (En dash)
- The Sox won the last game of the World Series, 7–4 (En dash)
- My phone number is 999‒999‒9999 (Figure dash)
- 4 − 2 = 2 (Minus sign)
Let's just not even go there.)
But it's a little bit of a disconnect when we crusade for "semantic markup" and "clean content" on the one hand, then turn around and substitute hyphens for dashes, dashes for minus signs, and minus signs for Social Security numbers on the other, probably swapping <p> for <br> while we're at it, so that when we write "123-456-7890" meaning "John Doe's phone number", some quasi-benevolent far-future deep-scanning Internet robot comes across our 200-year-old missive during a random Internet archaeology dig and determines we were trying to subtract 7890 from 456 from 123, getting negative 8,223, convincing said robot that humans somehow pose a threat and inciting a Terminator-style genocide which ultimately results in the extinction of 9/10ths of the human race followed by a thousand-year period of technological Luddism dominated by a deep-rooted terror of Zytrax, the Horned Lizard-God of the Forest.*
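For the skeptics, here is a toy version of that robot's reasoning. It is purely hypothetical (the function name and the matching rule are invented for illustration), but it shows the point: digits joined by ASCII hyphen-minuses can be read as subtraction, while the same digits joined by figure dashes cannot.

```python
import re

# A run of digits separated only by ASCII hyphen-minus (U+002D).
_LOOKS_LIKE_SUBTRACTION = re.compile(r"\d+(?:-\d+)+")


def naive_robot_reading(text):
    """Evaluate 'a-b-c' as left-to-right subtraction, or return None."""
    if not _LOOKS_LIKE_SUBTRACTION.fullmatch(text):
        return None
    first, *rest = (int(part) for part in text.split("-"))
    # a - b - c is the same as a - (b + c).
    return first - sum(rest)


if __name__ == "__main__":
    print(naive_robot_reading("123-456-7890"))            # hyphen-minus
    print(naive_robot_reading("123\u2012456\u20127890"))  # figure dash
```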
Your blog or website is not a dash-ocracy. You are the dash-tator. You will make the dash-cisions around here, and you will suffer the dash-sequences. But since it's six of one and half a dozen of the other anyway, why not do things as cleanly as possible?
* - Slight exaggeration here, but that doesn't make the danger of imprecise markup any less real.
Posted by James Devlin





