Problem With Big Endian and Little Endian


I have a problem understanding big endian and little endian. I'm using a little endian system.

//C++ Code

#include <iostream>
#include <arpa/inet.h>

using namespace std;

int main(){
  uint16_t a = 0b0000000000000001;

  cout<<"Little Endian = "<<a<<endl;
  cout<<"Big Endian = "<<ntohs(a)<<endl;
  uint16_t c = ntohs(a);

  cout<<"Big Endian = "<<c<<endl;
  cout<<"Little Endian = "<<htons(c)<<endl;
  return 0;
}

OUTPUT :
Little Endian = 1
Big Endian = 256
Big Endian = 256
Little Endian = 1
                   

Because I'm using a little endian system, ntohs will convert the 16-bit value to big endian, and htons will convert big endian back to little endian.

In the code example, the variable a contains 0000000000000001.

But there is one thing that is confusing:

00000000 00000001 = big endian format
00000001 00000000 = little endian format

Now the question is: how does 00000001 00000000 represent 1, given that this is the little endian format, and how does it end up printing 256? Those bytes don't look like 1 or 256.

There are 2 best solutions below

selbie

because i'm using little endian system ntohs will convert 16 bit to the big endian and htons will convert big endian to little endian.

Actually, ntohs and htons are both the same function. On a little endian system, these functions flip the byte order of a 16-bit integer. On a big endian system, both functions are no-ops and just return the value passed in.

This statement:

cout<<"Little Endian = "<<a<<endl;

is not printing little endian. It's just printing the value of a, independent of how the architecture encodes that value in memory. Or in layman's terms, it's "printing the value of a".

If you want to see how the number is laid out sequentially in memory, you can do some simple casting hacks like this:

uint8_t* ptrA = (uint8_t*)(&a);
std::cout << (uint32_t)(ptrA[0]) << " " << (uint32_t)(ptrA[1]) << endl;

Technically, the above hack of casting a uint16_t* to a uint8_t* may trigger language lawyers to say "this is undefined behavior", but let's just go with it since it works well enough on most compilers and systems.

Let's extend your program to show what we can do with inspecting how values are laid out in memory.

// add your header files for htons support here either <winsock2.h> on Windows or probably <arpa/inet.h> on Mac/Linux
#include <iostream>
#include <iomanip>
#include <string>
#include <sstream>

using namespace std;

string toBinary(uint8_t value)
{
    string s;
    for (size_t i = 0; i < 8; i++)
    {
        uint8_t mask = 0x80 >> i;
        bool bitValue = ((value & mask) != 0);
        s += bitValue ? "1" : "0";
    }
    return s;
}

string toHex(uint8_t value)
{
    ostringstream ss;
    ss << hex << setw(2) << setfill('0') << (uint32_t)value;
    return ss.str();
}

bool isLittleEndianSystem()
{
    static uint32_t value = 1;
    bool firstByte = *(uint8_t*)(&value);
    return firstByte != 0;
}

uint16_t swapEndianess(uint16_t value)
{
    if (isLittleEndianSystem())
    {
        return htons(value);
    }
    else
    {
        uint16_t b1 = (value & 0xff00) >> 8;
        uint16_t b2 = (value & 0x00ff);
        return b1|(b2<<8);
    }
   
}

int main()
{
    uint16_t a = 1;

    uint16_t b = swapEndianess(a);
    uint8_t* ptrA = reinterpret_cast<uint8_t*>(&a);
    uint8_t* ptrB = reinterpret_cast<uint8_t*>(&b);
    bool isLittleEndian = isLittleEndianSystem();


    cout << "Your system's native encoding is " << (isLittleEndian ? "little" : "big") << " endian" << endl << endl;

    cout << "Let's inspect this value: " << a << endl << endl;

    // print in hex
    cout << "The hex byte sequence in " << (isLittleEndian ? "little" : "   big") << " endian is " << toHex(ptrA[0]) << " " << toHex(ptrA[1]) << endl;
    cout << "The hex byte sequence in " << (isLittleEndian ? "   big" : "little") << " endian is " << toHex(ptrB[0]) << " " << toHex(ptrB[1]) << endl;
    cout << endl;

    // print in decimal
    cout << "The decimal byte sequence in " << (isLittleEndian ? "little" : "   big") << " endian is " << (uint32_t)(ptrA[0]) << " " << (uint32_t)(ptrA[1]) << endl;
    cout << "The decimal byte sequence in " << (isLittleEndian ? "   big" : "little") << " endian is " << (uint32_t)(ptrB[0]) << " " << (uint32_t)(ptrB[1]) << endl;
    cout << endl;

    // print in binary
    cout << "The binary bit sequence in " << (isLittleEndian ? "little" : "   big") << " endian is " << toBinary(ptrA[0]) << " " << toBinary(ptrA[1]) << endl;
    cout << "The binary bit sequence in " << (isLittleEndian ? "   big" : "little") << " endian is " << toBinary(ptrB[0]) << " " << toBinary(ptrB[1]) << endl;

    return 0;
}

And when run, it will print this:

Your system's native encoding is little endian

Let's inspect this value: 1

The hex byte sequence in little endian is 01 00
The hex byte sequence in    big endian is 00 01

The decimal byte sequence in little endian is 1 0
The decimal byte sequence in    big endian is 0 1

The binary bit sequence in little endian is 00000001 00000000
The binary bit sequence in    big endian is 00000000 00000001

Now let's change a to something more fun like 1234:

uint16_t a = 1234;

When the program is run with this change:

Your system's native encoding is little endian

Let's inspect this value: 1234

The hex byte sequence in little endian is d2 04
The hex byte sequence in    big endian is 04 d2

The decimal byte sequence in little endian is 210 4
The decimal byte sequence in    big endian is 4 210

The binary bit sequence in little endian is 11010010 00000100
The binary bit sequence in    big endian is 00000100 11010010
Dražen Grašovec

The assignment

uint16_t a = 0b0000000000000001;

means that the value of a is 1, regardless of endianness. The order of the bytes in memory will differ depending on endianness, but the compiler hides that from the user.

It makes no sense to use these functions in an arithmetic sense. They are meant to handle the byte order of packets sent and received over the network, i.e. to "swap bytes".

htons() ntohs()

These two functions bridge the format gap between the network and hosts. When hosts communicate over a network, these functions are used to convert a packet's byte order. If the host byte order is the same as the network byte order (i.e. the host is big-endian), the two functions simply do nothing. If the host byte order differs from the network's (i.e. the host is little-endian), htons() converts data from little-endian to big-endian, and ntohs() converts from big-endian back to little-endian.


That is what they are intended to do. It doesn't make sense to interpret the results of these operations arithmetically as a short int, because swapping the bytes produces a different number on any machine, regardless of whether it is little or big endian.