Diacritical Mark

Auxiliary symbols added above, below, or beside characters to indicate differences in pronunciation, such as accents and umlauts.

Diacritical marks are auxiliary symbols added above, below, or beside characters. Common examples include French accent marks (é, è, ê), German umlauts (ä, ö, ü), and the Spanish tilde (ñ).

In Unicode, a character with a diacritical mark can often be represented in two ways: as a single "precomposed character" (the form that NFC normalization prefers) or as a "base character + combining character" sequence (the form that NFD normalization produces). For example, "é" can be U+00E9 (precomposed) or U+0065 + U+0301 (decomposed).
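
The two representations of "é" can be inspected with a short Python sketch. It relies only on the standard-library unicodedata module; the helper name code_points is made up for this illustration.

```python
import unicodedata

# The same visible character "é" written two ways.
precomposed = "\u00e9"    # single code point U+00E9
decomposed = "e\u0301"    # U+0065 followed by U+0301 (combining acute accent)

def code_points(text):
    """Hypothetical helper: list the code points of a string as U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text]

# Normalization converts between the two representations.
nfd_of_precomposed = unicodedata.normalize("NFD", precomposed)
nfc_of_decomposed = unicodedata.normalize("NFC", decomposed)

print(code_points(precomposed))          # one code point
print(code_points(nfd_of_precomposed))   # base character + combining character
print(nfc_of_decomposed == precomposed)  # NFC recombines the pair
```

NFC can only recombine a sequence when Unicode defines a precomposed character for it, so some base-plus-mark combinations remain decomposed even after NFC.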

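This dual representation causes problems in string comparison. Strings that look identical may consist of different code point sequences, and therefore different byte sequences, so a direct comparison reports them as unequal. Normalizing both strings to the same form (NFC or NFD) before comparing makes the result consistent.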
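
A minimal Python sketch of the comparison problem, assuming the standard-library unicodedata module. The function name same_text and the sample word are chosen only for illustration.

```python
import unicodedata

def same_text(a, b, form="NFC"):
    """Hypothetical helper: compare two strings after normalizing both to one form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

a = "caf\u00e9"     # "café" with precomposed é
b = "cafe\u0301"    # "café" with e + combining acute accent

raw_equal = (a == b)                 # compares code points directly
normalized_equal = same_text(a, b)   # compares after NFC normalization

print(raw_equal, normalized_equal)
print(len(a.encode("utf-8")), len(b.encode("utf-8")))  # byte lengths differ too
```

Either form works for comparison as long as both sides use the same one; NFC is used as the default here only as an example choice.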

Character counting is affected as well. Counting code points in NFD-form text counts each base character and each combining character separately, so the result is higher than the number of visible characters.
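
The difference in counts can be sketched in Python as follows. Skipping code points with a nonzero combining class is only an approximation of "visible characters", used here as an assumption for illustration; full user-perceived character segmentation follows the grapheme cluster rules in Unicode Standard Annex #29, which this sketch does not implement. The function name count_without_combining is hypothetical.

```python
import unicodedata

def count_without_combining(text):
    """Hypothetical helper: count code points whose canonical combining class is 0.

    This approximates the visible character count for simple accented text;
    it is not a full grapheme cluster implementation.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)

word_nfc = "r\u00e9sum\u00e9"                     # "résumé" with precomposed é
word_nfd = unicodedata.normalize("NFD", word_nfc)  # each é becomes e + U+0301

nfc_count = len(word_nfc)        # code points in NFC form
nfd_count = len(word_nfd)        # code points in NFD form (two extra marks)
visible_estimate = count_without_combining(word_nfd)

print(nfc_count, nfd_count, visible_estimate)
```

Normalizing to NFC before calling len is a simpler alternative, but it undercounts the problem for sequences that have no precomposed form.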