The question "Is C++20 'char8_t' the same as our old 'char'?" is already some years old.
So I would like to know: what is the recommended way to handle conversion between char8_t and char right now? boost::nowide (1.80.0) does not yet understand char8_t, and (AFAIK) neither does boost::locale.
As Tom Honermann noted:
reinterpret_cast<const char   *>(u8"text"); // Ok.
reinterpret_cast<const char8_t*>("text");   // Undefined behavior.
So: how do I interact with APIs that only accept const char* or const wchar_t* (think Win32 API) if my application's "default" string type is std::u8string? The recommendation seems to be https://utf8everywhere.org/.
If I convert between std::u8string and std::string with
std::u8string convert(std::string str)
{
    return std::u8string(reinterpret_cast<const char8_t*>(str.data()), str.size());
}
std::string convert(std::u8string str)
{
    return std::string(reinterpret_cast<const char*>(str.data()), str.size());
}
This would invoke the same UB that Tom Honermann mentioned. I would use it whenever I talk to the Win32 API or any other API that wants a const char* or gives a const char* back. I could route all conversions through boost::nowide, but in the end I get a const char* back from boost::nowide::narrow() that I still need to cast.
Is the current recommendation to just stay with char and ignore char8_t?
                        
As pointed out in the post you referred to, UB only happens when you cast from a char* to a char8_t*. The other direction is fine.
If you are given a char* which is encoded in UTF-8 (and you care to avoid the UB of just doing the cast for some reason), you can use std::transform to convert the chars to char8_ts one character at a time, collecting them in a named return variable. C++23's ranges::to will make using a named return variable unnecessary.
For dealing with wchar_t interfaces (which you shouldn't have to, since nowadays UTF-8 support exists through narrow character interfaces on Windows), you'll have to do an actual UTF-8 -> UTF-16 conversion, which you would have had to do anyway.