Supporting Unicode and Emoji Characters in a Unity3D (or Other Engine) Game - The Lazy Way
Recently I was working on a client project. Most of their target customers were not English speakers, so they wanted to support multiple languages in their games. As if that was not enough, they also wanted to support emoji.
We were developing their game in Unity3D. Most of the game's actions were validated and executed on the backend, which ran on LAMP [Linux, Apache, MySQL and PHP]. By the time this issue was raised, most of the work was already done. The database structure was in place and the game data was already populated.
The Mess
A quick search on Google only added to the confusion. It turns out that Unicode characters can be stored in several encodings, such as UTF-8 and UTF-16. On top of that, there are variants of UTF-8 itself, such as CESU-8 and Modified UTF-8. And then there is MySQL, which has its own set of character sets. MySQL supports UTF-8, but its utf8 character set stores at most three bytes per character, and emoji need four. To store emoji you need a separate character set, known as utf8mb4. So despite Unicode's promise of one universal character set, there is nothing universal about how it gets encoded. Which one to use?
Since my database was already set up, I wanted the shortest possible way to get this sorted out. I configured the database to support utf8mb4 and found that even though the non-English characters showed up properly, the emoji were not visible on the mobile devices. It turns out that even emoji were not standardised. There are at least 4 different versions [Docomo, SoftBank, Google, etc.] of the emoji character space.
Fortunately, a guy named Cal did an awesome job of creating a PHP library that brings some semblance of order to the emoji mess. It is linked below.
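To make the utf8 versus utf8mb4 distinction concrete, here is a minimal C# sketch that counts how many bytes a few characters take in UTF-8. The class and method names (Utf8Width, ByteCount) are made up for illustration and are not part of any library; only Encoding.UTF8.GetByteCount is a real .NET call.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only.
public static class Utf8Width
{
    // Returns how many bytes the given text occupies when encoded as UTF-8.
    public static int ByteCount(string text)
    {
        return Encoding.UTF8.GetByteCount(text);
    }

    public static void Main()
    {
        Console.WriteLine(ByteCount("A"));          // Latin letter: 1 byte
        Console.WriteLine(ByteCount("\u00E9"));     // e with acute accent: 2 bytes
        Console.WriteLine(ByteCount("\u0939"));     // Devanagari letter HA: 3 bytes
        Console.WriteLine(ByteCount("\U0001F600")); // grinning face emoji: 4 bytes
    }
}
```

Anything that needs four bytes, which includes the emoji here, will not fit in a character set capped at three bytes per character.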
http://code.iamcal.com/php/emoji/
However, the above library will work only for web pages. Inside an app, it'd be difficult to slice an image and show the appropriate emoji.
The Lazy Solution
Long story short, I finally thought of converting the Unicode byte sequence to ASCII text. I checked a few options before settling on Base64 encoding.
$asciiString = base64_encode($unicodeString); // PHP Code
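What this does is convert your Unicode string with emoji [or anything else] to an encoded ASCII string, which you can store in your normal database without altering the database's or table's encoding. The flip side is that the stored string gets longer. Base64 produces 4 ASCII characters for every 3 bytes of input, and a single character can take up to 4 bytes in UTF-8, so the result can be several times the character count of the original [roughly 4 times for text whose characters take 3 bytes each]. You may need to increase the field lengths based on your requirements.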
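As a rough check on how much extra room a field needs, the sketch below compares the length of a string, its UTF-8 byte count and its Base64 length. It is written in C# rather than PHP, and the names Base64Growth and EncodedLength are hypothetical; the formula relies only on the fact that padded Base64 turns every 3 bytes (rounded up) into 4 characters.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only.
public static class Base64Growth
{
    // Length of the padded Base64 text for a given number of input bytes:
    // every group of 3 bytes (rounded up) becomes 4 ASCII characters.
    public static int EncodedLength(int byteCount)
    {
        return 4 * ((byteCount + 2) / 3);
    }

    public static void Main()
    {
        string text = "Hi \U0001F600"; // "Hi " followed by one emoji
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        string encoded = Convert.ToBase64String(bytes);

        Console.WriteLine(text.Length);    // UTF-16 code units in the C# string: 5
        Console.WriteLine(bytes.Length);   // UTF-8 bytes: 7
        Console.WriteLine(encoded.Length); // ASCII characters to store: 12
    }
}
```

So a field that has to hold N bytes of UTF-8 text needs room for EncodedLength(N) ASCII characters once the text is Base64 encoded.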
When you need the Unicode string back to display it on the device, decode it using the below method.
$unicodeString = base64_decode($asciiString); // PHP Code
You can of course do this inside the game code instead. In Unity, you can include the following namespaces:
System
System.Text
And then Base64 encode a Unicode string to ASCII by using:
byte[] byteStream = Encoding.UTF8.GetBytes (unicodeString);
string asciiString = Convert.ToBase64String (byteStream);
And then revert back to Unicode from ASCII using:
byte[] byteStream = Convert.FromBase64String (asciiString);
string unicodeString = Encoding.UTF8.GetString (byteStream);
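If you want both directions in one place, the calls above can be wrapped in a small helper. This is only a sketch: the Base64Text class and its Encode and TryDecode methods are hypothetical names, and returning false on malformed input is my own assumption about what a game client would want when the stored value is not valid Base64.

```csharp
using System;
using System.Text;

// Hypothetical wrapper for illustration only.
public static class Base64Text
{
    // Unicode string -> UTF-8 bytes -> Base64 ASCII string.
    public static string Encode(string unicodeString)
    {
        byte[] byteStream = Encoding.UTF8.GetBytes(unicodeString);
        return Convert.ToBase64String(byteStream);
    }

    // Base64 ASCII string -> UTF-8 bytes -> Unicode string.
    // Returns false instead of throwing when the input is not valid Base64.
    public static bool TryDecode(string asciiString, out string unicodeString)
    {
        try
        {
            byte[] byteStream = Convert.FromBase64String(asciiString);
            unicodeString = Encoding.UTF8.GetString(byteStream);
            return true;
        }
        catch (FormatException)
        {
            unicodeString = null;
            return false;
        }
    }

    public static void Main()
    {
        string stored = Encode("Hello \U0001F600");
        Console.WriteLine(stored); // ASCII-only text, safe for a non-Unicode column

        string restored;
        if (TryDecode(stored, out restored))
        {
            Console.WriteLine(restored == "Hello \U0001F600"); // True
        }
    }
}
```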
The Unicode string supports all known languages and emoji, and there's also the option of using the Private Use Area for your own custom characters. ;)
Conclusion
This may not be the most elegant solution, and perhaps there are better ones out there. But it gets the job done and is not messy. If you're stuck with a similar issue, give it a try. If you have a better solution, I'd love to know. :)