I'm reading PNG files from disk and inserting the bytes into a vector of chars, instantiated with the new keyword.
Opening the file:
std::ifstream file("./images/orange.png", std::ios_base::binary);
Instantiating a vector of length 8 (#define PNG_SIGNATURE 8), where every element is initially a space character:
std::vector<char>* header_buff { new std::vector<char>(PNG_SIGNATURE, ' ') };
Writing bytes to the vector, where f is a reference to the file, and b is a pointer to my vector buffer:
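// Fills the buffer pointed to by b with bytes read from f, then returns a copy of it.
std::vector<char> chunk_reader(std::vector<char>* b, std::ifstream &f) {
    char c { };
    // std::size_t matches the unsigned type returned by size()
    for (std::size_t i { }; i < b->size(); ++i) {
        if (!f.get(c)) break;  // stop early if the stream ends or fails
        (*b)[i] = c;
    }
    return *b;
}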
The first byte of a PNG is always going to be 0x89 (137 in decimal).
According to the reference material, std::ifstream is std::basic_ifstream<char>, so its char_type member type is char.
My question is this: on my implementation (MinGW GCC for Windows), when I debug my program I see the first byte of the PNG as a signed char with decimal value -119. For ease of parsing a binary stream, I think I'd want to use unsigned char, so that if I seek to a particular byte of interest, I can check its value in the source code as 137 instead of its signed representation.
To my knowledge, and based on the posts I've read on here with similar questions, C++ leaves the signedness of a char to the implementation for flexibility. So, if I insert the gcc flag -funsigned-char, I get the behavior I expect, and can visually see in the debugger the decimal value 137 for the first byte.
I've read on here from more experienced programmers that this is a band-aid, and that something like this should be kept in the source code for readability, i.e. a reinterpret_cast from char to unsigned char, which to my knowledge makes sense since they're both one byte of information.
But then I see posts saying reinterpret_cast is hacky, and should be reserved for the use cases it was designed for in the standard library.
Can someone offer some advice as to what is the best practice in this situation?
I'd ultimately like to perform certain validations of individual bytes, like in the case of a PNG the first byte of a named chunk will have a special meaning if it's an uppercase or lowercase ASCII character - which is simple if I'm dealing with unsigned chars and can use decimal values in the source code, in a switch statement for example.
I'm very new to C++, so I'd appreciate your advice. I'm interested in the best practices and scalability of designing such a system.
Make the vector the type you need and cast the ifstream input at the earliest opportunity.
Moreover, I'd suggest that if this is binary data rather than text, you use std::uint8_t.
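To illustrate, here is a minimal sketch of the kind of byte validation you describe, written against std::uint8_t (declared in <cstdint>). The names kPngSignature, has_png_signature and is_ancillary are invented for this example. The bit test relies on the PNG convention that bit 5 of the first chunk-type byte distinguishes lowercase (ancillary) from uppercase (critical).

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <vector>

// The eight-byte PNG signature, written as plain unsigned values.
constexpr std::array<std::uint8_t, 8> kPngSignature{
    0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};

// Hypothetical helper: true if `bytes` begins with the PNG signature.
bool has_png_signature(const std::vector<std::uint8_t>& bytes) {
    return bytes.size() >= kPngSignature.size() &&
           std::equal(kPngSignature.begin(), kPngSignature.end(), bytes.begin());
}

// Hypothetical helper: bit 5 (0x20) of the first chunk-type byte is set for
// lowercase ASCII letters, which PNG uses to mark an ancillary chunk.
bool is_ancillary(std::uint8_t first_type_byte) {
    return (first_type_byte & 0x20u) != 0;
}
```

Because every value is unsigned, comparisons against literals such as 0x89 or 137 behave the same with or without -funsigned-char. One thing to watch: on typical implementations std::uint8_t is an alias for unsigned char, so streaming it with << prints a character rather than a number unless you convert it to an int first.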