Caveats reinterpret_cast'ing char* to unsigned char*?

I'm trying to fit my C++ app to a C API. The API in question is Mbed TLS, which contains a base64 decoder:

int mbedtls_base64_decode( unsigned char *dst, size_t dlen, size_t *olen, const unsigned char *src, size_t slen )

The problem is that I arrive at the call site with a std::string, and I can't pass its C string as a function argument. If I do, I get this error:

<source>:17:27: error: invalid conversion from 'char*' to 'unsigned char*' [-fpermissive]
   17 |     mbedtls_base64_decode(buf, out_buf_size, &written, token.c_str(), in_buf_size);
      |                           ^~~
      |                           |
      |                           char*

(same repeats for the input string)

Code for review (view on godbolt):

#include <string>

extern "C" int mbedtls_base64_decode( unsigned char *dst, size_t dlen, size_t *olen, const unsigned char *src, size_t slen )
{
    /* dummy */
    return 0;
}

std::string token = "some token that I received over the internet";

int main()
{
    constexpr size_t in_buf_size = 10, out_buf_size = 10;
    
    size_t written;
    char buf[out_buf_size];
    mbedtls_base64_decode(buf, out_buf_size, &written, token.c_str(), in_buf_size);
}


Question:

What could happen in the worst possible case if I just use reinterpret_cast<unsigned char*> on my input strings? Why does the C API even require unsigned char? If that's the right representation for characters, then why isn't std::basic_string<unsigned char> the default?

Answer by Caleth:

What could happen in the worst possible case if I just use reinterpret_cast<unsigned char*> on my input strings?

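Nothing bad. You can always reinterpret_cast a pointer to unsigned char* and access any object through it, and char, signed char and unsigned char all have the same size and alignment, so you can cast between pointers to them in either direction. You are doubly safe. The bytes themselves are unchanged; only values outside 0..127 read differently (a char holding -1 reads as 255 through unsigned char), and base64 input is plain ASCII anyway. Because c_str() returns const char*, cast the input to const unsigned char* rather than unsigned char*.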
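A minimal sketch of what the call could look like with the casts in place. To keep it buildable without Mbed TLS, it uses a stand-in named fake_base64_decode that has the signature quoted in the question but only copies bytes; that function and the helper decode_into are hypothetical names made up for this illustration.

```cpp
#include <cstddef>
#include <string>

// Stand-in with the same parameter list as the function quoted in the
// question. It only copies bytes; it is NOT a base64 decoder.
extern "C" int fake_base64_decode(unsigned char *dst, std::size_t dlen,
                                  std::size_t *olen,
                                  const unsigned char *src, std::size_t slen)
{
    if (slen > dlen) return -1;              // destination too small
    for (std::size_t i = 0; i < slen; ++i) dst[i] = src[i];
    *olen = slen;
    return 0;
}

// Hypothetical helper: hands std::string storage to the C function.
inline bool decode_into(const std::string &in, std::string &out)
{
    out.resize(in.size());                   // assumed upper bound for output
    std::size_t written = 0;
    const int rc = fake_base64_decode(
        reinterpret_cast<unsigned char *>(&out[0]), out.size(), &written,
        // data() is const char*, so the cast must keep the const.
        reinterpret_cast<const unsigned char *>(in.data()), in.size());
    if (rc != 0) return false;
    out.resize(written);                     // keep only what was written
    return true;
}
```

With the real library you would replace the stand-in with mbedtls_base64_decode and size the output buffer according to its documentation; the casts stay the same.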

Why is the C-API even requiring unsigned char?

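Ask the author. At a guess, because arithmetic on unsigned values is safer: overflow and underflow wrap around with defined behaviour, and a byte is always in the range 0..255 regardless of whether plain char happens to be signed on the platform.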
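To illustrate that guess (and it is only a guess at the library author's motive), here is a small sketch of byte-oriented code where the signedness of char matters. The function name histogram is made up for this example; it counts byte values and has to convert each char to unsigned char before using it as an array index.

```cpp
#include <array>
#include <cstddef>

// Hypothetical example: count how often each byte value occurs.
inline std::array<std::size_t, 256> histogram(const char *s, std::size_t n)
{
    std::array<std::size_t, 256> counts{};   // zero-initialised
    for (std::size_t i = 0; i < n; ++i) {
        // Convert to unsigned char first so the index is always 0..255.
        // Where plain char is signed, s[i] itself could be negative and
        // would index outside the array.
        counts[static_cast<unsigned char>(s[i])]++;
    }
    return counts;
}
```

An API that takes unsigned char* from the start does not need that conversion at every lookup, which is one plausible reason a byte-processing library would prefer it.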

If that's the right representation for characters, then why isn't std::basic_string<unsigned char> the default?

Because you generally don't need to do arithmetic on characters.