SQLite UTF-16 Encoding Issues

OK, I've been pulling my hair out for a couple of days on this issue. There are a couple of technologies in use here: I'm using Unreal Engine 4 to develop an iOS game, and I'm linking to a static lib of sqlite3. The database itself is created on Windows.

On Windows everything works fine: I create the database, and running PRAGMA encoding; shows UTF-16LE.

On iOS, however, everything falls apart. If I try to create even an empty database on iOS using the sqlite3_open16 function, it creates a database file with a bunch of junk at the end of the name, and if I open that file and run PRAGMA encoding; it reports UTF-8 (this is an empty database with no tables). If I try to connect to my existing database, it succeeds only 'randomly'. I think this is again related to the weird characters appearing at the end of my string, which I suspect is an encoding issue.

The function being used to open the database is this:

bool Open(const TCHAR* ConnectionString)
{
    int32 Result = sqlite3_open16(ConnectionString, &DbHandle);
    return Result == SQLITE_OK;
}

This works fine on Windows but has the issues above on iOS.

According to the UE4 documentation, the engine uses UCS-2. From what I can tell in the sqlite source, it will use UTF-16LE. Do I need to do something to convert between these two, or is there something else I might be missing here? Does anyone have any ideas? I'm hoping that even someone who isn't familiar with UE4 might have some guesses.

edit: a list of things I've tried:

  1. Use SQLite's UTF-8 functions. These appear to work fine: UE4 has a TCHAR_TO_UTF8 macro, and that worked.

  2. Try to use Objective-C to force the encoding to UTF-16LE. This gave me the 'random' success I describe above, still with the weird random text at the end of the string some of the time. On top of that, any time I try to pull data out of the database now, it comes back as mostly question marks '????' with the occasional Chinese character. The function I used for this is:

    const TCHAR* UChimeraSqlDatabase::UTF16_To_PlatformEncoding(FString UTF16EncodedString)
    {
    #if PLATFORM_IOS
        const TCHAR* EncodedString = (const TCHAR *)([[[NSString stringWithFString : UTF16EncodedString] dataUsingEncoding:NSUTF16LittleEndianStringEncoding] bytes]);
    #else
        const TCHAR* EncodedString = *UTF16EncodedString;
    #endif
        return EncodedString;
    }
    
  3. Tried using Unreal's AppendChar to add L'\0' to the end of the string, without using the method from number 2. No success.

There are 1 best solutions below

If you're seeing weird characters at the end of the file name when calling sqlite3_open16, it sounds like your UTF-16 file name was not NUL-terminated, that is, it did not end with a 16-bit zero code unit.
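One way to rule out both a missing terminator and a wrong code-unit width is to build the file name as an explicit UTF-16 string before handing it to SQLite. The following is only a minimal sketch in standard C++: ToUtf16 is a hypothetical helper, not part of SQLite or UE4, and it assumes the input holds either UTF-16 code units (where wchar_t is 16 bits, as on Windows) or one Unicode code point per element (where wchar_t is 32 bits, as on Apple platforms). Whether TCHAR maps to wchar_t in your iOS build is an assumption worth checking.

```cpp
#include <string>

// Hypothetical helper (not a SQLite or UE4 API): converts a wide string
// into UTF-16 code units. std::u16string::c_str() always returns a buffer
// that ends with a 16-bit NUL.
std::u16string ToUtf16(const std::wstring& Wide)
{
    std::u16string Out;
    Out.reserve(Wide.size());
    for (wchar_t Ch : Wide)
    {
        const char32_t Cp = static_cast<char32_t>(Ch);
        if (sizeof(wchar_t) == 2 || Cp < 0x10000)
        {
            // Already a UTF-16 code unit, or a code point that fits in one.
            Out.push_back(static_cast<char16_t>(Cp));
        }
        else
        {
            // Code point above U+FFFF: encode it as a surrogate pair.
            const char32_t V = Cp - 0x10000;
            Out.push_back(static_cast<char16_t>(0xD800 + (V >> 10)));
            Out.push_back(static_cast<char16_t>(0xDC00 + (V & 0x3FF)));
        }
    }
    return Out;
}
```

With a helper like this, the call would look something like sqlite3_open16(Path.c_str(), &DbHandle), where Path is the std::u16string returned by ToUtf16 and is kept alive for the duration of the call. sqlite3_open16 takes the file name as a const void* and interprets it as UTF-16 in the native byte order.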

To specify the encoding of the database, you can actually create it with any of the sqlite3_open functions, but the key is that as soon as the database is created, you must immediately set the encoding:

PRAGMA encoding = "UTF-16le";

Once the encoding has been set, you can't change it, so make sure to do this first thing after creating the database.