In the latest release of emojis in 2015, human emojis were given skin colors, gendered emojis included more women 👩‍❤️‍👩, and family emojis welcomed same-sex couples. With the 140-character limit on Twitter, you may have noticed that typing these newer emojis costs you more characters than before: one additional character for skin color, two characters total for flags, and up to seven characters for families. The reason has to do with Unicode and character encoding.
A look at past Internet petitions for emoji updates shows that many people think companies like Apple or Google control the emojis on your phone. However, emojis are governed by the Unicode Consortium and defined in Unicode, a technology standard for encoding characters, emojis included. Just like any other set of characters, emojis go through rounds of modification and approval in technical documents written by the Unicode Consortium. The emojis we know today were first introduced in Unicode 6.0 in October 2010.
The release of Unicode 8.0 in 2015 added skin colors to human emojis based on the Fitzpatrick scale, a classification of human skin types. The five available skin colors 🏻🏼🏽🏾🏿 are actual characters in Unicode, control characters known as symbol modifiers. When you select a human emoji with a skin color from the emoji palette, behind the scenes the base human character is sent along with a symbol modifier character. This is why tweeting an emoji with skin color counts as two characters.
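To make the two-character count concrete, here is a minimal Python sketch. The choice of the woman character and the medium skin tone is an assumption made purely for illustration, and `describe` is a hypothetical helper name. It relies on Python's standard `unicodedata` module, which needs a Python version whose Unicode database includes 8.0 or later.

```python
import unicodedata

# Illustrative choices only: any human emoji and any of the five modifiers would do.
WOMAN = "\U0001F469"        # base human character
MEDIUM_SKIN = "\U0001F3FD"  # one of the five skin tone modifiers (U+1F3FB..U+1F3FF)


def describe(text):
    """Return a (code point, Unicode name) pair for each code point in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


emoji = WOMAN + MEDIUM_SKIN

# One visible emoji, but two code points behind the scenes.
print(len(emoji))
for code_point, name in describe(emoji):
    print(code_point, name)

# The modifier's general category is "Sk", which stands for "Symbol, modifier".
print(unicodedata.category(MEDIUM_SKIN))
```

Whether the pair is drawn as a single tinted emoji or as two separate glyphs depends on the font and platform displaying it.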
Unicode uses control characters to make some existing emojis possible. To form a flag emoji, regional indicator symbols, another type of control character, are combined according to two-letter country codes: IT for Italy 🇮🇹, FR for France 🇫🇷, VE for Venezuela 🇻🇪, and so on. As of Unicode 8.0, there are 257 flag emojis based on a list of internationally recognized countries. Because Unicode includes only the characters used to build these emojis, and no actual flag characters, it avoids having to update the set of flag emojis as countries fall, new countries rise, or flag designs change.
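The mapping from a country code to a flag can be sketched in a few lines of Python. The function name `flag` is hypothetical. The sketch relies on the 26 regional indicator symbols occupying a contiguous range that starts at U+1F1E6 for the letter A, and it does not check that the two letters form a real country code.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def flag(country_code):
    """Build a flag emoji string from a two-letter country code such as "IT"."""
    code = country_code.upper()
    if len(code) != 2 or not all("A" <= letter <= "Z" for letter in code):
        raise ValueError("expected a two-letter country code")
    # Shift each letter A..Z onto the matching regional indicator symbol.
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(letter) - ord("A")) for letter in code
    )


for code in ("IT", "FR", "VE"):
    emoji = flag(code)
    # Each flag is two code points, which is why it counts as two characters.
    print(code, emoji, len(emoji))
```

A pair of letters that matches no supported flag still produces two valid characters; a platform will typically show them as two boxed letters instead of a flag.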
The nuclear family emoji 👪 of one man, one woman, and one child was included in the original Unicode 6.0 spec and is considered a base emoji equal to one character. To expand the emoji palette to include non-nuclear families, zero-width joiner (ZWJ) characters are used to glue together existing characters for man 👨, woman 👩, girl 👧, and boy 👦 to signify a grouping for a family 👨‍👩‍👧‍👦.
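As a rough illustration of where the seven characters come from, the Python sketch below glues four people together with zero-width joiners (U+200D). The helper name `join_family` is hypothetical, and not every combination of members is guaranteed to render as a single emoji on every platform.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER

MAN = "\U0001F468"
WOMAN = "\U0001F469"
GIRL = "\U0001F467"
BOY = "\U0001F466"


def join_family(*members):
    """Glue individual person characters into one ZWJ sequence."""
    return ZWJ.join(members)


family = join_family(MAN, WOMAN, GIRL, BOY)

# Four people plus three joiners between them: seven code points in total.
print(len(family))
print([f"U+{ord(ch):04X}" for ch in family])

# The same helper covers other groupings, such as two men and a boy.
two_dads = join_family(MAN, MAN, BOY)
print(len(two_dads))
```

On a platform that does not recognize a given sequence, the joiners are simply invisible and the members appear side by side as separate emojis.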
Activist groups today continue to petition to ban the gun emoji, to correct “sexist” emojis, or even to add redheaded emojis. While none of these petitioners are actively involved in the Unicode process, their voices seem to be heard. A drafted update to Unicode would expand emojis to include multiple attributes, such as gender, hair color, left-right direction, or locale. This new mechanic could allow diverse emojis such as a mixed-race family or a dark-skinned man with red hair.
With this proposed mechanic, Unicode would give companies like Apple or Google the power to interpret their very own palette of emojis. Future updates like this continue to open up the world of emojis to a much greater range of expression and possibilities.