In the past, any special character had to be written as an entity code. That is less necessary today, as long as you specify a character set for the document and always save your files in that same character set. Entity codes are still useful if you’re combining data sources whose character sets are unknown, or if several people work on the same document and you’re worried that they won’t all use the same character set.
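As a minimal sketch of specifying a character set, the page below declares UTF-8 in its head. UTF-8 is an assumption here (the text above doesn’t name a particular character set), and the declaration only helps if the file is actually saved in that encoding.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Declare the character set early in the head (UTF-8 is assumed here) -->
  <meta charset="utf-8">
  <title>Character set example</title>
</head>
<body>
  <!-- With a declared character set, these can be typed directly -->
  <p>Café, naïve, and “quoted” text.</p>
</body>
</html>
```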
Ellipses and m-dashes
Two common special characters are m-dashes and ellipses. There are two ellipses in this document and one m-dash. The ellipses are currently three periods, and the m-dash is two smaller dashes. Change them to &hellip; and &mdash;, respectively.
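A short before-and-after sketch of that change; the sentence is made up for illustration and is not taken from the exercise document.

```html
<!-- Before: three periods and two hyphens as stand-ins -->
<p>Wait for it... the ending--if you can call it that--is abrupt.</p>

<!-- After: the ellipsis and m-dash written as entity codes -->
<p>Wait for it&hellip; the ending&mdash;if you can call it that&mdash;is abrupt.</p>
```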
Entities (special characters): Quotes
The most common special characters you’ll use are typographer’s quotes, since they improve the readability of your text.
Character | Entity code | Displays as |
Left double quote | &ldquo; | “ |
Right double quote | &rdquo; | ” |
Left single quote | &lsquo; | ‘ |
Right single quote | &rsquo; | ’ |
Go ahead and change all of the straight quotes to the appropriate typographer’s quote. You may find the search and replace feature of your text editor useful for this. Hint: most single quotes are right single quotes, since they’re used for contractions. Warning: make sure you don’t change the straight quotes used inside HTML tags to mark attribute values!
<p>In grade school, I used to look forward to those “educational” films about faraway places or road safety, good or bad. Some of the good ones might have been made by Herk Harvey. Criterion includes several “educational” films directed by Herk for Centron, and about four of his commercial films to give you an idea of what he was doing before and after “Carnival of Souls”. They range from “Signals: Read’em or Weep” (my favorite) to promos for Korea, Jamaica, and Kansas itself where Centron was located. The Kansas promo (Star 34, after Kansas’s star on the U.S. flag) is especially interesting because it shows a very young Herk Harvey, when he had just started working for the Centron Corporation.</p>
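To illustrate the warning about attribute values, here is a small sketch. The link and its file name (carnival.html) are hypothetical. Only the quotes in the visible text become entity codes, and the straight quotes around the href and title values stay as they are.

```html
<!-- Before: straight quotes in both the text and the attributes -->
<p>It's called <a href="carnival.html" title="Review">"Carnival of Souls"</a>.</p>

<!-- After: text quotes are entity codes; attribute quotes are left alone -->
<p>It&rsquo;s called <a href="carnival.html" title="Review">&ldquo;Carnival of Souls&rdquo;</a>.</p>
```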
Other characters
Other commonly encoded characters are accented letters and other diacritics.
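Letter | acute accent (´) | grave accent (`) | umlaut (¨) | circumflex (ˆ) |
a | &aacute; (á) | &agrave; (à) | &auml; (ä) | &acirc; (â) |
e | &eacute; (é) | &egrave; (è) | &euml; (ë) | &ecirc; (ê) |
i | &iacute; (í) | &igrave; (ì) | &iuml; (ï) | &icirc; (î) |
o | &oacute; (ó) | &ograve; (ò) | &ouml; (ö) | &ocirc; (ô) |
u | &uacute; (ú) | &ugrave; (ù) | &uuml; (ü) | &ucirc; (û) |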
You can capitalize the first letter of the entity name to get the capitalized version of that letter: “&Eacute;” produces É.
Others include “&ccedil;” for ç and “&ntilde;” for ñ.
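As a quick sketch, here is how a few of these codes look inside a paragraph. The words are arbitrary examples chosen to show one entity of each kind.

```html
<!-- Displays as: Café crème, naïve, crêpe, garçon, mañana, École. -->
<p>Caf&eacute; cr&egrave;me, na&iuml;ve, cr&ecirc;pe, gar&ccedil;on, ma&ntilde;ana, &Eacute;cole.</p>
```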
Entities (special characters): Ampersands
Because special characters are encoded using the ampersand, the one special character you always have to worry about is the ampersand itself. You should never have a “bare” ampersand in your web pages. All ampersands, even the ones in URLs, must be encoded using “&amp;”.
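A small sketch of both cases, an ampersand in text and one in a URL. The page name and query parameters are hypothetical.

```html
<!-- An ampersand in ordinary text -->
<p>Books &amp; Records</p>

<!-- An ampersand separating query parameters inside a URL -->
<a href="search.html?director=harvey&amp;year=1962">Search results</a>
```

The browser decodes “&amp;” before following the link, so the address it requests contains a plain ampersand.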