Hyphens and Dashes

Horizontal line symbols used in text for joining words, indicating ranges, and setting off parenthetical phrases. The hyphen (-), en dash (–), and em dash (—) look similar but are distinct characters in Unicode.

Hyphens and dashes are horizontal line symbols that are easily confused due to their visual similarity. Unicode defines them as separate characters: the hyphen-minus (U+002D), the hyphen (U+2010), the en dash (U+2013), the em dash (U+2014), and the horizontal bar (U+2015). Each serves a different purpose, and using them correctly is a mark of typographic quality.
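As a quick way to tell these characters apart, the short Python sketch below prints the code point and official Unicode name of each one, using the standard `unicodedata` module. The list `DASH_LIKE` and the helper `describe` are illustrative names, not part of any library.

```python
import unicodedata

# The five code points named in the paragraph above.
DASH_LIKE = ["\u002d", "\u2010", "\u2013", "\u2014", "\u2015"]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in DASH_LIKE:
    print(describe(ch))
```

Running a check like this on pasted text is often the easiest way to find out which dash a document actually contains, since many fonts draw them almost identically.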

The hyphen-minus "-" (U+002D) is the only horizontal line character in ASCII and can be typed directly from the keyboard. In programming it doubles as the minus operator, and in URLs and filenames it serves as a word separator. It originated as a compromise character combining the roles of hyphen and minus sign, and in practice it covers the vast majority of use cases.

The en dash "–" (U+2013) is used to express ranges (e.g., 1990–2000, pp. 10–20) and connections between paired terms (e.g., Tokyo–Osaka). The em dash "—" (U+2014) marks parenthetical insertions or abrupt breaks in a sentence. Formal English publishing observes these distinctions strictly, but on the web and in everyday writing the hyphen-minus commonly substitutes for both.
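When text has been typed with hyphen-minus throughout, numeric ranges can be upgraded to en dashes mechanically. The following is a deliberately naive sketch: it replaces any hyphen-minus that sits directly between two digits, so it would also rewrite ISO dates, phone numbers, and subtraction. The function name `to_en_dash_ranges` is hypothetical, and a real pipeline would need extra rules for those cases.

```python
import re

EN_DASH = "\u2013"

# A hyphen-minus with a digit immediately before and after it.
_DIGIT_RANGE = re.compile(r"(?<=\d)-(?=\d)")


def to_en_dash_ranges(text: str) -> str:
    """Replace digit-hyphen-digit with digit-en-dash-digit (naive)."""
    return _DIGIT_RANGE.sub(EN_DASH, text)


print(to_en_dash_ranges("See pp. 10-20 for the years 1990-2000."))
```

Hyphenated words such as "well-known" are left alone, because no digit is adjacent to the hyphen.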

In Japanese typography, the full-width hyphen-minus "－" (U+FF0D) and the double em dash "——" (two U+2014 characters) are used instead. Japanese text also contains the prolonged sound mark "ー" (U+30FC), which looks like a dash but lengthens the preceding vowel sound rather than acting as punctuation. Standard Japanese typesetting calls for dashes that are one em (full-width) or two ems wide. The half-width hyphen-minus is too narrow for Japanese body text, so full-width dashes and hyphens are preferred.

From a character-counting perspective, all of these horizontal line characters count as a single character, but their byte sizes differ. The hyphen-minus (U+002D) is 1 byte in UTF-8, while the en dash (U+2013) and em dash (U+2014) are each 3 bytes. When a character limit is enforced on a byte-count basis, the type of dash you choose affects how many bytes are consumed.
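The difference between character count and byte count is easy to verify. The sketch below, with the hypothetical helper `utf8_len`, compares the same range written with a hyphen-minus and with an en dash.

```python
def utf8_len(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


with_hyphen = "1990-2000"        # hyphen-minus, U+002D
with_en_dash = "1990\u20132000"  # en dash, U+2013

for label, s in [("hyphen-minus", with_hyphen), ("en dash", with_en_dash)]:
    print(f"{label}: {len(s)} characters, {utf8_len(s)} bytes")
```

Both strings are 9 characters long, but the en dash version takes 11 bytes instead of 9, which matters when a field limit is defined in bytes.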

In programming, whether a hyphen-minus can appear in an identifier depends on the language. CSS class names and property names allow hyphens (kebab-case), but JavaScript variable names do not. In URL paths, hyphens are recommended over underscores for SEO purposes, and domain names may contain hyphens as well, though not at the beginning or end of a label.
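The domain-name rule can be expressed as a small check. This is a minimal sketch that assumes the classic hostname convention of ASCII letters, digits, and hyphens, with at most 63 characters per label. The function name `is_valid_label` is hypothetical, and the sketch ignores internationalized names and registry-specific restrictions.

```python
import re

# One DNS label: letters, digits, and hyphens, 1 to 63 characters,
# with no hyphen in the first or last position.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_label(label: str) -> bool:
    """Check a single dot-separated part of a hostname."""
    return _LABEL.fullmatch(label) is not None


for candidate in ["my-site", "-site", "site-", "my_site"]:
    print(candidate, is_valid_label(candidate))
```

Only "my-site" passes: the next two break the leading and trailing hyphen rule, and the last contains an underscore, which this convention does not allow.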
