What is Unicode SMS?
The term “Unicode SMS” refers to text messages that contain characters not included in the default GSM character set. GSM stands for Global System for Mobile Communications (originally “Groupe Spécial Mobile”), and the GSM character set is a collection of the 128 letters (A-Z), numbers (0-9) and symbols (e.g. @, ?, !, &) most commonly used in mobile communications.
Because GSM was developed in Europe and uses 7-bit binary code, it has its limitations. For example, 128 characters are not enough to include the Cyrillic alphabet or the characters used in Chinese, Arabic and Thai, among many other languages. Programmers have tried a number of different ways to overcome the shortcomings of GSM, and the most successful is the Unicode standard.
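As a rough illustration of the distinction above, the following Python sketch checks whether a message stays within the GSM character set or would have to be sent as a Unicode SMS. The character table is transcribed here from the GSM 03.38 default alphabet and its extension table, and the function name `needs_unicode` is hypothetical; treat the table as an assumption and check it against the specification before relying on it.

```python
# GSM 03.38 default alphabet: 128 characters, one 7-bit code each.
# Transcribed by hand for illustration; verify against the specification.
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)

# Extension table: characters reached through an escape code.
GSM7_EXTENDED = "^{}\\[~]|€\f"


def needs_unicode(message: str) -> bool:
    """Return True if any character falls outside the GSM character set."""
    allowed = set(GSM7_BASIC) | set(GSM7_EXTENDED)
    return any(ch not in allowed for ch in message)


if __name__ == "__main__":
    for text in ["Hello, world!", "Привет", "你好"]:
        print(text, "->", "Unicode SMS" if needs_unicode(text) else "GSM")
```

A check like this is typically run before sending, so that the sender knows in advance which character limit applies.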
How the Unicode Standard Works
The Unicode standard assigns every character its own number, known as a “code point”. Instead of using 7-bit binary code, its most widely used encoding, UTF-8, uses 8-bit “code units” and combines up to four code units at a time. This extends the number of characters that can be encoded from 128 to 1,112,064 - enough capacity for all the major world languages to be encoded in a single character set. There is even an unofficial range of codes for Klingon, although this has not been endorsed by the Unicode Consortium.
Cleverly, rather than using four code units to transmit every character, this encoding (known as UTF-8) only uses as many as are necessary. For example, the binary code for a capital “A” could be expressed as [00000000 00000000 00000000 01000001], but instead is expressed as [01000001] in order to save space. If the full four code units were used for every character in a text message, the 140 bytes available in a single SMS would hold only 35 characters rather than 160. In practice, SMS messages that contain non-GSM characters are sent in UCS-2, a 16-bit Unicode encoding, which is why a Unicode SMS holds 70 characters.
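The variable-length idea can be seen directly in Python, whose built-in `str.encode` supports both UTF-8 and the fixed-width UTF-32. This is a minimal sketch; the helper names `utf8_units` and `utf8_bits` and the constant `SMS_PAYLOAD_BYTES` are made up for illustration.

```python
# A single SMS carries 140 bytes: 160 characters x 7 bits = 1120 bits.
SMS_PAYLOAD_BYTES = 140


def utf8_units(ch: str) -> int:
    """Number of 8-bit code units UTF-8 needs for one character."""
    return len(ch.encode("utf-8"))


def utf8_bits(ch: str) -> str:
    """The UTF-8 code units of a character, written out in binary."""
    return " ".join(f"{byte:08b}" for byte in ch.encode("utf-8"))


if __name__ == "__main__":
    for ch in ["A", "é", "€", "😀"]:
        print(f"{ch!r}: {utf8_units(ch)} code unit(s) -> [{utf8_bits(ch)}]")

    # A fixed four code units per character, as in UTF-32:
    fixed_width = len("A".encode("utf-32-be"))
    print("Characters per SMS at 4 bytes each:", SMS_PAYLOAD_BYTES // fixed_width)
```

The last line shows why a fixed four-byte encoding would be wasteful for text messaging: every character, however common, would consume four of the 140 available bytes.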
How to Find Unicode Character Codes
Because writing out up to thirty-two binary digits at a time is time-consuming and error-prone, Unicode character codes are written in hexadecimal with the prefix “U+”. The way each code is turned into bytes is defined by a Unicode Transformation Format (UTF), such as UTF-8. So, whereas the binary code for the dollar sign (“$”) is [00100100], the Unicode character code is “U+0024”.
With so many Unicode characters, looking up the code for one specific character can be like looking for a needle in a haystack. The Unicode Consortium publishes code charts that are free to download and print, and there are several useful online resources available that can identify a character and provide its Unicode character code when you copy and paste the character or draw it freehand.
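The relationship between the binary form and the “U+” notation can be sketched in a few lines of Python using the built-in `ord` function. The helper names `code_point` and `binary_code` are hypothetical.

```python
def code_point(ch: str) -> str:
    """Format a character's Unicode number in the usual U+XXXX notation."""
    return f"U+{ord(ch):04X}"


def binary_code(ch: str) -> str:
    """The same number written as binary digits, padded to a whole byte."""
    return format(ord(ch), "08b")


if __name__ == "__main__":
    print(binary_code("$"), "->", code_point("$"))
    print(code_point("€"))
```

Both functions describe the same number; hexadecimal is simply the more compact way to write it.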
However, one of the simplest ways to find a Unicode character code is to copy and paste the character into a Word document. On a Windows operating system, you then press [Alt] and [x] together, and the character code replaces the character. Word on a Mac does not appear to offer the same shortcut, but the Character Viewer described below shows the code for any character you select.
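If Python is available, a pasted character can also be identified with the standard library's `unicodedata` module, which returns a character's official name and can look a character up by that name. This is a small sketch rather than a replacement for the code charts; the function name `describe` is made up for illustration.

```python
import unicodedata
from typing import Tuple


def describe(ch: str) -> Tuple[str, str]:
    """Return the U+ code and the official Unicode name of a character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")


if __name__ == "__main__":
    for ch in "$€é":
        code, name = describe(ch)
        print(ch, code, name)

    # The reverse direction: find a character from its official name.
    print(unicodedata.lookup("EURO SIGN"))
```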
How to Insert Characters into a Unicode SMS Message
The method for inserting characters into a Unicode SMS message will vary depending on the device you are using to send the text message. People wanting to use Unicode characters in an SMS text message sent from a mobile device should find the Unicode character set included in their device's settings (for example, Menu > Messages > Settings > SMS > Sending Preferences > Alphabet). If not, it should be possible to download a character set from the Original Equipment Manufacturer (OEM).
Windows and Mac users who want to insert “non-standard” characters into a Unicode SMS message sent from a bulk texting platform may already have the character in a Word document from the reverse look-up process described above. If so, you can simply copy and paste the character into the text message. If not, follow these steps:
- Windows users should type “charmap” into Cortana (Windows 10) or into the Search Windows option (Windows 7/XP). Select the “Advanced View” and enter the hexadecimal value of the Unicode character code (the part after “U+”) into the box entitled “Go to Unicode”. Then double-click the character, select “Copy” and paste the character into the text message.
- Mac users should go to the “Character Viewer” (formerly the “Character Palette”), click “Customize List” and then select “Unicode”. You will then see a character list that has a very useful search function. Search or scroll through the list until you find the character you need, right-click on the character to “Copy Character Info” and paste it into the text message.
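For anyone comfortable with a scripting language, the same lookup can be done without a character map. The following Python sketch turns a code such as “U+20AC” back into the character itself so it can be pasted into a message; the function name `char_from_code` is hypothetical.

```python
def char_from_code(code: str) -> str:
    """Convert a code such as 'U+20AC' (or just '20AC') into its character."""
    hex_digits = code.strip().upper()
    if hex_digits.startswith("U+"):
        hex_digits = hex_digits[2:]
    return chr(int(hex_digits, 16))


if __name__ == "__main__":
    for code in ["U+0024", "U+20AC", "U+4F60"]:
        print(code, "->", char_from_code(code))
```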
The Advantages and Disadvantages of Unicode SMS Text Messaging
There is one major advantage to Unicode SMS text messaging, but unfortunately two potential disadvantages. The major advantage is that if your business operates a text messaging service, and a substantial proportion of your contacts speak Chinese, Arabic, Thai or another language that GSM cannot represent, you can connect with them in their native language. This can give you a significant commercial advantage over other businesses in your sector, and help to develop relationships with potential customers.
One potential disadvantage of Unicode SMS text messaging is that the recipients of your messages must have mobile devices capable of translating the Unicode character codes into readable characters (otherwise they will appear as “□□□□□”). Another is that including even a single Unicode character reduces the space available for your whole message.
However, the majority of (for example) Chinese speakers should already have the Chinese character set on their mobile devices, and it doesn't necessarily hurt to keep text marketing messages within 70 characters - the usual maximum for a single SMS message once it contains even one Unicode character. If you can overcome these issues, the advantage of Unicode SMS text messaging far outweighs its potential disadvantages.
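To see how the 70-character limit plays out, the sketch below estimates how many SMS segments a message needs. It assumes the commonly quoted limits of 160 characters (153 per part when a message is split) for GSM text and 70 characters (67 per part) for Unicode text, counts GSM extension characters such as “€” as two, and ignores the edge case where a two-unit character straddles a segment boundary. The character table and the function name `sms_segments` are assumptions made for illustration.

```python
import math
from typing import Tuple

# GSM 03.38 default alphabet and extension table, transcribed for illustration.
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM7_EXTENDED = "^{}\\[~]|€\f"


def sms_segments(message: str) -> Tuple[str, int, int]:
    """Return (encoding, units used, estimated number of SMS segments)."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in message):
        # Extension characters cost two 7-bit codes (escape + character).
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in message)
        encoding, single, multi = "GSM-7", 160, 153
    else:
        # One non-GSM character switches the whole message to 16-bit units.
        units = len(message.encode("utf-16-be")) // 2
        encoding, single, multi = "UCS-2", 70, 67
    segments = 1 if units <= single else math.ceil(units / multi)
    return encoding, units, segments


if __name__ == "__main__":
    print(sms_segments("A" * 160))        # fits in one GSM message
    print(sms_segments("A" * 69 + "ж"))   # one Cyrillic letter: 70-unit limit
    print(sms_segments("A" * 70 + "ж"))   # one unit too many: split in two
```

The second and third examples show the practical point: a single Cyrillic letter is enough to change the limit for the entire message.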
Find Out More about Unicode SMS from CallFire
If you would like to find out more about Unicode SMS text messaging, the Unicode standard and how to insert characters into a Unicode SMS message, you are invited to contact us and discuss your specific requirements with our industry-leading Client Success Management team. Unicode SMS opens a world of opportunity for your organization to connect with members of your community and potential customers. Don't let the opportunity pass you by. Contact us today.