I am setting a path to a file on the hard drive using the following interface:
void setPath(const char* path);
This path will be used for basic file I/O.
If I provide, for example, a path containing Chinese characters (e.g. via QString::toUtf8()), this works fine on Unix, but of course not on Windows, because of the internal use of the wchar/wstring API.
I am now searching for an elegant way to make this interface UTF-8 compatible on both Windows and Unix-based systems. Is there a way to avoid the wide API on Windows and keep using std::string and std::ofstream?
After looking at boost::locale, this appears to me to be a possible way to handle UTF-8 encoding. Would this be the way to go (for example, replacing std::ofstream with its boost::filesystem::ofstream counterpart)?
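boost::locale::generator generator;  // assumed: the declaration was not shown in the original snippet
const std::locale loc = generator.generate(std::locale(), "zh_CN.UTF-8");
std::locale::global(loc);
std::cout.imbue(std::locale());
boost::filesystem::path::imbue(std::locale());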
All help is appreciated.
The Windows API does not support UTF-8, except in a few select APIs. Largely it only supports locale-dependent ANSI and UTF-16. To support Unicode without losing data, you have to use the UTF-16 based APIs.
Your interface will need to internally convert UTF-8 strings to UTF-16 when passing them to Windows API functions, and convert from UTF-16 to UTF-8 when receiving data from the API. There is no other way. This belongs in your underlying platform-specific logic, not in the higher-layer public interface.
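To make the conversion step concrete, here is a minimal portable sketch of a UTF-8 to UTF-16 converter. The function name utf8ToUtf16 is my own and not part of any library. In real Windows code you would more likely call MultiByteToWideChar, or use the C++11 std::wstring_convert facility (deprecated since C++17), so treat this only as an illustration of what those do internally.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from any library): decode UTF-8 and re-encode as UTF-16.
// Throws std::invalid_argument on malformed input.
std::u16string utf8ToUtf16(const std::string& utf8) {
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes that follow the lead byte
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > utf8.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        // Reject overlong encodings, surrogate code points, and values above U+10FFFF.
        static const char32_t minForLength[] = {0, 0x80, 0x800, 0x10000};
        if (cp < minForLength[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));  // one UTF-16 code unit
        } else {
            cp -= 0x10000;                             // surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows, wchar_t is 16 bits wide, so the resulting code units can be copied into a std::wstring before being handed to the wide API.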
You can use std::string for UTF-8, and there are plenty of ways to convert between std::string UTF-8 and std::wstring UTF-16 (there are even classes in C++11 to handle that).
Microsoft has non-standard extensions to std::ifstream and std::ofstream in Visual Studio to accept UTF-16 filenames. Other vendors may or may not provide similar functionality.
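Putting this together, a rough sketch of how the public setPath(const char*) interface could stay UTF-8 while the platform-specific layer does the conversion might look like the following. The class name Utf8FileWriter and the helper widen are made up for illustration. The Windows branch assumes the Microsoft standard library (it relies on the non-standard wide-filename constructor mentioned above), so other Windows toolchains are not covered by this sketch.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

#ifdef _MSC_VER
#include <windows.h>

// Windows-only helper (hypothetical name): UTF-8 -> UTF-16 via the Win32 API.
static std::wstring widen(const std::string& utf8) {
    if (utf8.empty()) return std::wstring();
    const int bytes = static_cast<int>(utf8.size());
    const int len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                        utf8.data(), bytes, nullptr, 0);
    if (len <= 0) return std::wstring();  // invalid UTF-8: opening the file will then fail
    std::wstring wide(static_cast<std::size_t>(len), L'\0');
    MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                        utf8.data(), bytes, &wide[0], len);
    return wide;
}
#endif

// Hypothetical wrapper: the public interface takes UTF-8 on every platform.
class Utf8FileWriter {
public:
    void setPath(const char* path) { path_ = path ? path : ""; }

    // Opens the stored path for writing; check is_open() on the result.
    std::ofstream open() const {
#ifdef _MSC_VER
        // Non-standard Microsoft extension: std::ofstream accepts a wide filename.
        return std::ofstream(widen(path_).c_str(), std::ios::binary);
#else
        // On Unix-like systems the UTF-8 bytes are passed through unchanged.
        return std::ofstream(path_.c_str(), std::ios::binary);
#endif
    }

private:
    std::string path_;  // always UTF-8
};
```

Callers keep passing UTF-8 (e.g. the result of QString::toUtf8()) to setPath() and never see a wide string; only open() knows about the platform difference.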