GSM-7 Encoding
A 7-bit character encoding used in SMS messaging that supports 128 basic characters and allows 160 characters per single message segment.
GSM-7 is the default character encoding for SMS (Short Message Service), defined in the GSM 03.38 standard. It uses 7 bits per character, allowing 160 characters to fit within the 1,120-bit payload of a single SMS segment (140 bytes × 8 bits ÷ 7 bits = 160 characters). The basic character set covers the Latin alphabet (uppercase and lowercase), digits 0-9, common punctuation, and a handful of Greek letters and currency symbols. An extension table accessed via an escape character (0x1B) adds characters like curly braces, square brackets, the euro sign, and the tilde, but each extended character consumes 2 of the 160-character budget because the escape prefix itself counts as one character.
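The counting rule above can be sketched in Python. This is a minimal illustration, assuming the default GSM 03.38 alphabet with no national language shift tables; the names GSM7_BASIC, GSM7_EXTENDED, and gsm7_septets are hypothetical and chosen only for this example.

```python
# Basic GSM 03.38 alphabet (assumption: default table, no national shift
# tables). The escape code 0x1B is left out because it is not a printable
# character, so this set holds 127 entries.
GSM7_BASIC = frozenset(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)

# Extension table: each of these is sent as the escape (0x1B) plus one code.
GSM7_EXTENDED = frozenset("\f^{}\\[~]|€")


def gsm7_septets(text):
    """Return how many 7-bit units the text needs, or None if any
    character falls outside the GSM-7 repertoire."""
    total = 0
    for ch in text:
        if ch in GSM7_BASIC:
            total += 1
        elif ch in GSM7_EXTENDED:
            total += 2  # escape prefix plus the character itself
        else:
            return None
    return total
```

With this sketch, gsm7_septets("[ok]") returns 6 because each bracket costs two units, and a text containing an emoji returns None.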
When a message contains any character outside the GSM-7 repertoire, the entire message falls back to UCS-2 encoding, which uses 16 bits per character. This cuts the per-segment capacity from 160 to just 70 characters (140 bytes ÷ 2 bytes per character). The most common trigger for this fallback is a single emoji: including one smiley face in an otherwise GSM-7-compatible message silently switches the encoding and slashes the available space by more than half. For businesses sending bulk SMS at per-segment pricing, this encoding switch can double or triple messaging costs overnight if templates are updated with emoji without understanding the encoding implications.
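The cost of the fallback can be illustrated by counting 16-bit units. The sketch below assumes that characters outside the Basic Multilingual Plane, which includes most emoji, are sent as UTF-16 surrogate pairs and therefore count as two units; this is common gateway behavior but is an assumption here, not something the text above states. The function name utf16_units is hypothetical.

```python
def utf16_units(text):
    """Count the 16-bit code units needed to send the text as a
    Unicode SMS (assumption: emoji travel as UTF-16 surrogate pairs)."""
    return len(text.encode("utf-16-be")) // 2


plain = "Hello"
with_emoji = "Hello 😀"

print(utf16_units(plain))       # 5 units
print(utf16_units(with_emoji))  # 8 units: 6 characters plus 2 for the emoji
```

Under that assumption, a 69-character text followed by one emoji needs 71 units and no longer fits in a single 70-unit segment.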
Multi-part messages add another layer of complexity. When a message exceeds the single-segment limit, a User Data Header (UDH) is prepended to each segment to carry reassembly instructions. This header consumes 6 bytes (48 bits), leaving 134 bytes of payload. In GSM-7, every segment of a multi-part message, including the first, holds 153 characters instead of 160 (134 bytes × 8 bits ÷ 7 bits, rounded down). In UCS-2, segments shrink from 70 to 67 characters. A 161-character GSM-7 message therefore requires two segments totaling 306 characters of capacity, wasting 145 character slots. Understanding these thresholds is essential for anyone designing SMS content: staying at or below 160 GSM-7 characters (or 70 UCS-2 characters) avoids segment splitting entirely.
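The segment arithmetic can be reproduced with a short Python sketch. It assumes the 140-byte payload and 6-byte UDH given above, and it ignores the detail that an escape sequence for an extended character cannot be split across two segments, which can occasionally push a message into one more segment. The names per_segment_limit and segment_count are hypothetical.

```python
SMS_PAYLOAD_BYTES = 140  # payload of one SMS segment
UDH_BYTES = 6            # concatenation header size used in this article


def per_segment_limit(encoding, multipart):
    """Characters (GSM-7) or 16-bit units (UCS-2) that fit in one segment."""
    payload = SMS_PAYLOAD_BYTES - (UDH_BYTES if multipart else 0)
    if encoding == "gsm7":
        return payload * 8 // 7  # 7 bits per character, rounded down
    if encoding == "ucs2":
        return payload // 2      # 2 bytes per unit
    raise ValueError(f"unknown encoding: {encoding}")


def segment_count(units, encoding):
    """Number of segments needed for a message of the given length."""
    if units <= per_segment_limit(encoding, multipart=False):
        return 1
    per_part = per_segment_limit(encoding, multipart=True)
    return -(-units // per_part)  # ceiling division


print(per_segment_limit("gsm7", multipart=True))  # 153
print(segment_count(161, "gsm7"))                 # 2
```

The length passed to segment_count should already account for extended characters counting twice in GSM-7 and for surrogate pairs counting twice in UCS-2.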
Modern messaging has largely moved beyond GSM-7's constraints. RCS (Rich Communication Services), iMessage, and apps like WhatsApp and LINE use internet-based protocols with no per-message character limits, full Unicode support, and rich media capabilities. However, GSM-7 remains relevant for SMS-based two-factor authentication codes, emergency alerts, and markets where smartphone penetration is lower. For character counting tools, GSM-7 awareness is a practical feature: detecting whether a user's text stays within the GSM-7 character set and showing the resulting segment count helps marketers and developers avoid unexpected costs. The encoding also serves as an excellent teaching example of how character encoding directly determines message capacity.