Combining Character
A Unicode character that combines with the preceding base character for display. Includes diacritical marks and dakuten.
A combining character is a Unicode character that is not displayed on its own but attaches to the preceding base character, so that the two render as a single visible character. The most common examples are in the Combining Diacritical Marks block (U+0300-U+036F).
For instance, the Latin letter "a" (U+0061) followed by the combining acute accent (U+0301) displays as "á". Japanese has the combining dakuten (U+3099) and the combining handakuten (U+309A): the hiragana "か" (U+304B) followed by U+3099 displays as "が".
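As a small illustration of the sequences above, the following Python sketch builds both examples and inspects each code point with the standard `unicodedata` module. The variable names are illustrative only; a nonzero canonical combining class is used here as a simple indicator that a code point is a combining mark.

```python
import unicodedata

base = "a"                    # U+0061 LATIN SMALL LETTER A
accent = "\u0301"             # U+0301 COMBINING ACUTE ACCENT
sequence = base + accent      # displays as one character, stored as two code points

ka_dakuten = "\u304B\u3099"   # HIRAGANA LETTER KA + combining dakuten

for ch in sequence + ka_dakuten:
    # combining() returns the canonical combining class; 0 means "not a combining mark"
    ccc = unicodedata.combining(ch)
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}  combining class: {ccc}")
```

The base letters report a combining class of 0, while the two marks report nonzero values.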
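Because of combining characters, text that looks identical can be stored as different code point sequences: "á" may be the single precomposed code point U+00E1 or the two-code-point sequence U+0061 U+0301. Unicode normalization (NFC composes sequences into precomposed characters where possible, NFD decomposes them) is needed before comparing, searching, or sorting such text consistently.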
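A minimal sketch of normalization in Python, assuming the standard `unicodedata.normalize` function. The helper name `same_text` is hypothetical and exists only for this example.

```python
import unicodedata

decomposed = "a\u0301"    # base letter + combining acute accent
precomposed = "\u00E1"    # single precomposed code point for the same letter

# NFC composes, NFD decomposes
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", precomposed)

def same_text(left: str, right: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", left) == unicodedata.normalize("NFC", right)

# The raw strings differ even though they display identically
raw_equal = decomposed == precomposed
normalized_equal = same_text(decomposed, precomposed)
```

Here the raw comparison is false and the normalized comparison is true. Either form works for comparison as long as both sides use the same one.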
For character counting, each combining character is a separate code point, so a length API such as String.length (which in JavaScript counts UTF-16 code units) does not match the number of visible characters. Counting by grapheme clusters gives a result that matches what a reader perceives as characters.
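The difference between the two counts can be sketched in Python, where `len` counts code points. The function `count_visible_characters` is a hypothetical helper that only approximates grapheme clusters by skipping code points in the Mark categories (Mn, Mc, Me). It is not an implementation of the full segmentation rules in Unicode Standard Annex #29, so it does not handle cases such as emoji joined with zero-width joiners.

```python
import unicodedata

def count_visible_characters(text: str) -> int:
    """Approximate the grapheme cluster count by not counting combining marks.

    Assumption: every mark attaches to a preceding base character.
    """
    return sum(
        1 for ch in text
        if not unicodedata.category(ch).startswith("M")  # skip Mn, Mc, Me
    )

accented = "a\u0301"          # displays as one character
voiced_kana = "\u304B\u3099"  # displays as one character

code_point_count = len(accented)                      # counts code points
visible_count = count_visible_characters(accented)    # counts base characters
```

For production use, a library that implements the full grapheme cluster rules is the safer choice.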