How Emoji Combinations Change Meaning - The Difference in Information Conveyed by One Character vs. Two or More


When you pick "👨‍👩‍👧‍👦" from your phone's emoji keyboard, you think you are typing a single emoji. In reality, this family emoji is a concatenation of seven code points: "👨 + ZWJ + 👩 + ZWJ + 👧 + ZWJ + 👦." One character on the surface, seven underneath. In the world of emoji, combinations can dramatically change both the meaning and the character count of a single symbol.

ZWJ Sequences - Invisible Glue That Merges Emoji

ZWJ (Zero Width Joiner) is an invisible character assigned to Unicode code point U+200D. It displays nothing on screen but acts as glue that bonds the emoji on either side into one.

The mechanism is simple. Place emoji A + ZWJ + emoji B in sequence, and if the OS or app has a glyph (rendered image) for that combination, A and B are displayed as a single merged emoji. If no matching glyph exists, A and B simply appear side by side. In other words, a ZWJ sequence is a flexible system: "merge if possible, otherwise just show them separately."

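ZWJ Sequence | Components | Code Points | Display
👨‍👩‍👧‍👦 | 👨 + ZWJ + 👩 + ZWJ + 👧 + ZWJ + 👦 | 7 | Family (father, mother, daughter, son)
👩‍💻 | 👩 + ZWJ + 💻 | 3 | Woman technologist
🏳️‍🌈 | 🏳️ + VS16 + ZWJ + 🌈 | 4 | Rainbow flag
👨‍🍳 | 👨 + ZWJ + 🍳 | 3 | Man cook
🧑‍🚀 | 🧑 + ZWJ + 🚀 | 3 | Astronaut
❤️‍🔥 | ❤️ + VS16 + ZWJ + 🔥 | 4 | Heart on fire
👩‍❤️‍👨 | 👩 + ZWJ + ❤️ + VS16 + ZWJ + 👨 | 6 | Couple with heart

VS16 is variation selector-16 (U+FE0F), an invisible code point that requests emoji presentation for the character before it. It counts toward the totals above.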

Family emoji are among the ZWJ sequences with the highest code point counts. The four-person family "👨‍👩‍👧‍👦" is 7 code points and 25 bytes when encoded in UTF-8 (4 emoji x 4 bytes + 3 ZWJ x 3 bytes). A single emoji consumes as much data as 25 English letters.
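The arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the variable names are illustrative, and the emoji are written as escape sequences so the invisible joiner stays visible in the source.

```python
# Build the four-person family emoji from its parts.
ZWJ = "\u200d"  # ZERO WIDTH JOINER

# man, woman, girl, boy
members = ["\U0001F468", "\U0001F469", "\U0001F467", "\U0001F466"]
family = ZWJ.join(members)

code_points = len(family)                 # Python's len() counts code points
utf8_bytes = len(family.encode("utf-8"))  # emoji are 4 bytes each, ZWJ is 3

print(family, code_points, utf8_bytes)
print([f"U+{ord(ch):04X}" for ch in family])
```

Whether the result renders as one glyph or four depends on the font and platform; the string itself is the same either way.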

What makes ZWJ sequences interesting is that, in theory, you can attempt to join any two emoji. Entering "🐱 + ZWJ + 🐉" (cat and dragon), an undefined combination, will not cause an error. Without a matching glyph, the cat and dragon simply appear side by side. The Unicode Consortium's list of ZWJ sequences recommended for general interchange runs to more than a thousand entries once skin tone and gender variants are counted, and vendors sometimes add their own support beyond it.

Flag Emoji - Two Regional Indicator Characters Paint One Flag

🇯🇵 (the flag of Japan) looks like a single emoji, but it is actually a combination of two characters: "Regional Indicator Symbol Letter J" (U+1F1EF) and "Regional Indicator Symbol Letter P" (U+1F1F5). The ISO 3166-1 alpha-2 country code "JP" is expressed using dedicated Unicode characters.

Flag | Country Code | Regional Indicators | Code Points
🇯🇵 | JP | 🇯 + 🇵 | U+1F1EF U+1F1F5
🇺🇸 | US | 🇺 + 🇸 | U+1F1FA U+1F1F8
🇬🇧 | GB | 🇬 + 🇧 | U+1F1EC U+1F1E7
🇫🇷 | FR | 🇫 + 🇷 | U+1F1EB U+1F1F7
🇧🇷 | BR | 🇧 + 🇷 | U+1F1E7 U+1F1F7
🇰🇷 | KR | 🇰 + 🇷 | U+1F1F0 U+1F1F7

There are 26 regional indicator symbols (A through Z), yielding 26 x 26 = 676 theoretical combinations. However, only the roughly 250 country and territory codes registered in ISO 3166-1 actually display as flags. Unregistered combinations (e.g., "🇽🇽") may render as "XX" text or a blank, depending on the platform.
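The mapping from a country code to its flag is a fixed offset from the letter A, which a short Python sketch can illustrate. The function name flag_from_country_code is hypothetical, and the code does not check whether the pair corresponds to a registered ISO 3166-1 code.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A

def flag_from_country_code(code: str) -> str:
    """Map a two-letter code such as "JP" to its regional indicator pair."""
    code = code.upper()
    if len(code) != 2 or not all("A" <= ch <= "Z" for ch in code):
        raise ValueError("expected two ASCII letters")
    # Each letter keeps its distance from "A" in the regional indicator block.
    return "".join(chr(REGIONAL_INDICATOR_A + ord(ch) - ord("A")) for ch in code)

japan = flag_from_country_code("JP")
print(japan, [f"U+{ord(ch):04X}" for ch in japan])

# Any of the 26 x 26 pairs can be built, registered or not.
print(flag_from_country_code("XX"))
```

Because the function accepts any two letters, it can also produce unregistered pairs, which is a convenient way to see how a given platform renders them.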

This design embeds a political consideration. The Unicode Consortium avoided the political judgment of "which regions count as countries" by not defining flag emoji directly, instead delegating to the existing international standard ISO 3166-1. If a new country gains independence and is registered in ISO 3166-1, its flag needs no new code points: the regional indicator pair already exists, and the flag appears once Unicode's emoji data lists the sequence and vendors ship a glyph for it.

As discussed in URL character limits, country codes appear in many corners of the internet. The ccTLD (country code top-level domain) ".jp" is also based on the same ISO 3166-1 country code.

Skin Tone Modifiers - One Emoji in Five Colors

Skin tone modifiers (Emoji Modifiers), introduced in Unicode 8.0 in 2015, change the skin color of human emoji. Five modifier levels are provided, based on the Fitzpatrick scale used in dermatology.

Modifier | Code Point | Fitzpatrick Type | Example (👋 + modifier)
🏻 | U+1F3FB | Type I-II (light skin) | 👋🏻
🏼 | U+1F3FC | Type III (medium-light skin) | 👋🏼
🏽 | U+1F3FD | Type IV (medium skin) | 👋🏽
🏾 | U+1F3FE | Type V (medium-dark skin) | 👋🏾
🏿 | U+1F3FF | Type VI (dark skin) | 👋🏿

Adding a skin tone modifier turns a single emoji into 2 code points. "👋" (U+1F44B) is 1 code point, but "👋🏽" is "U+1F44B U+1F3FD" - 2 code points. In UTF-8, that is 4 bytes + 4 bytes = 8 bytes. Specifying a skin tone alone doubles the data size.

When ZWJ sequences and skin tone modifiers are combined, the code point count explodes. For example, a couple emoji with different skin tones, "👩🏻‍❤️‍👨🏿", consists of 👩 + 🏻 + ZWJ + ❤️ + VS16 + ZWJ + 👨 + 🏿: 8 code points. It looks like a single emoji, yet it has nearly as many code points as the English phrase "Hi there!" has characters (9), and at 28 bytes in UTF-8 it takes about three times the storage.
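To see how modifiers and joiners add up, the following Python sketch assembles the waving hand and the mixed-tone couple from escape sequences and reports code points and UTF-8 bytes for each. The helper name describe and the constant names are illustrative.

```python
ZWJ = "\u200d"            # ZERO WIDTH JOINER
VS16 = "\ufe0f"           # VARIATION SELECTOR-16 (emoji presentation)
LIGHT = "\U0001F3FB"      # Fitzpatrick Type I-II modifier
MEDIUM = "\U0001F3FD"     # Fitzpatrick Type IV modifier
DARK = "\U0001F3FF"       # Fitzpatrick Type VI modifier

def describe(text: str) -> tuple[int, int]:
    """Return (code points, UTF-8 bytes) for a string."""
    return len(text), len(text.encode("utf-8"))

wave = "\U0001F44B"
wave_medium = wave + MEDIUM
# woman + light, ZWJ, heavy black heart + VS16, ZWJ, man + dark
couple = "\U0001F469" + LIGHT + ZWJ + "\u2764" + VS16 + ZWJ + "\U0001F468" + DARK

for label, text in [("wave", wave), ("wave + tone", wave_medium),
                    ("couple", couple), ("Hi there!", "Hi there!")]:
    points, size = describe(text)
    print(f"{label}: {points} code points, {size} bytes")
```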

Emoji Slang - A Hidden Language Born from Combinations

Emoji are used not only with their official meanings but also as slang within user communities. Individually harmless emoji can take on entirely different meanings when combined.

The most famous example is probably 🍑🍆. A peach and an eggplant - food emoji - but on social media they are widely recognized as sexual innuendo. In 2019, Instagram restricted search results for posts containing this combination.

Emoji Combination | Character Count | Official Meaning | Slang Meaning
🍑🍆 | 2 chars | Peach and eggplant | Sexual innuendo
🧢 | 1 char | Baseball cap | Lie (cap = to lie)
💀 | 1 char | Skull | Dying of laughter
🐐 | 1 char | Goat | GOAT (Greatest Of All Time)
👁️👄👁️ | 3 chars | Eyes and mouth | Shocked / bewildered face
🫠 | 1 char | Melting face | Embarrassed / flustered
🤡 | 1 char | Clown | Someone who did something foolish

"👁️👄👁️" is a form of "emoji art" that arranges three emoji to create a face: eye + mouth + eye, expressing an "indescribable expression." This three-character combination went viral on TikTok around 2020, used to convey shock, bewilderment, or a sense of "I just saw something."

Emoji slang varies by generation and region. In Japan, 🙏 is used to mean "please" or "thank you," while in Western countries it is sometimes interpreted as a "high five." As explained in emoji Unicode and character counting, the technical character count of an emoji and the "amount of meaning" a human perceives are on entirely different planes.

How Different Platforms Count Emoji Characters

Emoji character counting varies significantly across platforms. The same emoji posted on Twitter (X) and Instagram consumes a different number of characters.

Platform | Emoji Counting Method | 👨‍👩‍👧‍👦 Count | 🇯🇵 Count
Twitter (X) | Character count after NFC normalization | 1 char (= 2 chars consumed) | 1 char (= 2 chars consumed)
Instagram | UTF-16 code unit count | 11 chars | 4 chars
LINE | Proprietary counting | 1 char | 1 char
SMS | UCS-2 / UTF-16 (16-bit units) | 11 chars | 4 chars
JavaScript | UTF-16 code units | .length = 11 | .length = 4

Twitter (X) is relatively generous, treating ZWJ sequence family emoji and flag emoji as a single emoji as they appear visually (though internally each counts as 2 characters). As detailed in Twitter's character limit, within the 280-character limit each emoji counts as 2 characters.

JavaScript's .length property, on the other hand, returns the number of UTF-16 code units, so emoji containing surrogate pairs return a value larger than the visual character count. The family emoji "👨‍👩‍👧‍👦" has a .length of 11. Array.from(str).length and [...str].length count code points instead of code units, but they still split ZWJ sequences and return 7. To count by grapheme cluster, which is what the reader actually sees, use the Intl.Segmenter API.
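The 16-bit counts in this section can be reproduced outside JavaScript. As a rough sketch, the Python helper below (utf16_code_units is a hypothetical name) encodes a string as UTF-16 and divides the byte length by two, which matches what a 16-bit-unit counter reports.

```python
def utf16_code_units(text: str) -> int:
    """Count UTF-16 code units, the unit JavaScript's .length reports."""
    # "utf-16-le" emits no byte order mark, so every 2 bytes is one code unit.
    return len(text.encode("utf-16-le")) // 2

family = "\U0001F468\u200d\U0001F469\u200d\U0001F467\u200d\U0001F466"
flag_jp = "\U0001F1EF\U0001F1F5"

for label, text in [("family", family), ("flag", flag_jp)]:
    print(f"{label}: {utf16_code_units(text)} UTF-16 units, {len(text)} code points")
```

Each emoji outside the Basic Multilingual Plane becomes a surrogate pair (2 units), while ZWJ fits in a single unit, which is where 4 x 2 + 3 = 11 comes from.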

You may also find the summary of SNS character limits helpful. Knowing how each platform counts characters can save you from "character limit exceeded" headaches when posting emoji-heavy content.

Emoji Shiritori and Movie Title Guessing - Character Counts in Play

Emoji-based games also yield interesting discoveries from a character count perspective.

"Emoji shiritori" is a game where you play the Japanese word chain game using emoji names. 🍎 (ringo / apple) -> 🦍 (gorira / gorilla) -> 🎺 (rappa / trumpet)... and so on. The rules are simple, but it is surprisingly hard if you do not know the official names. For example, the official name of 🫥 is "Dotted Line Face." Every one of the nearly 3,800 emoji has a localized name defined in Unicode's CLDR (Common Locale Data Repository).

"Guess the movie title from emoji" is another popular game. For instance, "🦁👑" is "The Lion King" (2 emoji representing a 13-character title), "👻👻👻🔫" is "Ghostbusters" (4 emoji for a 12-character title), and "🧙‍♂️💍🌋" is "The Lord of the Rings" (3 emoji for a 21-character title). Emoji combinations might be considered a kind of "ultra-compressed language" that conveys meaning while drastically reducing the visible character count.

Getting the "True Character Count" of Emoji in Code

For developers, counting emoji characters is a headache because different languages and runtimes return different values.

Language / Environment | Method | Result for "👨‍👩‍👧‍👦" | Counting Unit
JavaScript | "👨‍👩‍👧‍👦".length | 11 | UTF-16 code units
JavaScript | [..."👨‍👩‍👧‍👦"].length | 7 | Code points
Python 3 | len("👨‍👩‍👧‍👦") | 7 | Code points
Swift | "👨‍👩‍👧‍👦".count | 1 | Grapheme clusters
Rust | "👨‍👩‍👧‍👦".len() | 25 | Bytes (UTF-8)
Go | len("👨‍👩‍👧‍👦") | 25 | Bytes (UTF-8)

Only Swift returns "1" because Swift uses grapheme clusters as its unit of character measurement. This is the result closest to human intuition, though it comes at a higher internal processing cost. To get the same result in JavaScript, use Intl.Segmenter.
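Python's standard library has no grapheme segmenter, so the sketch below hand-rolls a deliberately simplified one to show the idea behind a count of "1". It is an assumption-laden approximation, not the full Unicode segmentation algorithm (UAX #29): it only handles ZWJ, variation selectors, skin tone modifiers, tag characters, combining marks, and regional indicator pairs. The function names are hypothetical.

```python
import unicodedata

ZWJ = "\u200d"

def is_extender(ch: str) -> bool:
    """True for code points that attach to the preceding character."""
    cp = ord(ch)
    return (
        ch == ZWJ
        or 0xFE00 <= cp <= 0xFE0F        # variation selectors (incl. VS16)
        or 0x1F3FB <= cp <= 0x1F3FF      # skin tone modifiers
        or 0xE0020 <= cp <= 0xE007F      # tag characters
        or unicodedata.category(ch) in ("Mn", "Me")  # combining marks
    )

def is_regional_indicator(ch: str) -> bool:
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF

def count_graphemes(text: str) -> int:
    """Approximate the number of user-perceived characters in text."""
    count = 0
    prev = ""
    ri_run = 0  # regional indicators seen in the current unbroken run
    for ch in text:
        if is_regional_indicator(ch):
            ri_run += 1
            joins = ri_run % 2 == 0      # second of a pair joins the first
        else:
            ri_run = 0
            # Simplification: anything after a ZWJ is treated as joined.
            joins = is_extender(ch) or prev == ZWJ
        if not joins or prev == "":
            count += 1
        prev = ch
    return count

family = "\U0001F468\u200d\U0001F469\u200d\U0001F467\u200d\U0001F466"
print(len(family), count_graphemes(family))
```

For production use, a maintained implementation of the full rules is the safer choice; this version exists only to make the joining logic visible.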

As explained in Unicode fundamentals, the definition of "character count" varies by context. Emoji combinations bring this issue into the sharpest focus. Just as full-width and half-width character counting differs, remember that emoji counting methods also vary by platform and language.

The Future of Emoji - Limitless Combination Possibilities

As of Unicode 16.0 (2024), the total number of emoji is approximately 3,790, a figure that already counts the recommended ZWJ sequences and skin tone variants. Because any emoji can be joined with ZWJ, the number of sequences that could be defined in the future is far larger.

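Emoji 15.1 (2023), with platform support arriving largely in 2024, introduced direction-specifying sequences that turn certain human emoji to face right instead of the default left. These are ZWJ sequences rather than a new kind of modifier: 🏃 (person running) + ZWJ + ➡️ produces 🏃‍➡️ (person running facing right), which is 4 code points (U+1F3C3 U+200D U+27A1 U+FE0F). This is yet another factor increasing code point counts.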

Emoji combinations have dramatically expanded the expressive power of text communication. In everyday chat, you do not need to worry about how many code points make up a single emoji. But when posting to character-limited social media, designing database character fields, or processing strings in code, the gap between "visual character count" and "internal character count" can become an unexpected pitfall.

As also mentioned in the article on LINE message character counts, messages heavy on emoji carry more data than text-only messages. Next time you pick an emoji, it might be fun to imagine just how many code points are hiding behind that single character.
