I am sort of a newbie to assembly language, and I need help understanding how mnemonics are converted directly to bytes.
For example, I have a line saying
b 0x00002B78
which is located at the memory address 0x00002A44. How does this translate to EA00004B (the byte representation of the above assembly)? I am under the impression that the "EA00" signifies the "b" branching part of the assembly, but what about the "004B"? If anyone can give a general understanding of this and resources to find conversions and such, that would be appreciated. I tried googling this, but I am really not too sure what exactly to google, and what I have found so far has not been helpful.
All the information you're looking for is in the ARM Architecture Reference Manual. If you look up the b instruction, you'll see its encoding and how it works. For the specific instruction you care about, the fields break down as follows.
The E is the condition field (bits 31:28), which you can look up in the manual's condition code table. For you, it's "execute always". Then comes the A, which in binary is 1010, to match bits 27:24 (you have a branch instruction, not a branch & link instruction). Lastly, the rest of the instruction is the immediate offset field. It's a PC-relative offset rather than an absolute address, which is why it's encoded as 0x00004b and not as 0x2B78.
Let's look at your specific example now. You have the instruction:
b 0x00002B78
located at address 0x00002a44. OK, great. So first off, we can stick in the opcode bits: bits 27:25 are 101. Now, the L bit (bit 24) is zero for our case, so bits 27:24 are 1010. We want to execute this instruction unconditionally, so we add the AL condition code bits, 1110, in bits 31:28. And now all we have to do is calculate the offset. The PC will be 0x2a4c when this instruction is executed (the PC is always "current instruction + 8" in ARM), so our relative jump needs to be 0x2b78 - 0x2a4c = 0x12c. Great - now we apply the reverse of the transformations described in the documentation, right-shifting 0x12c by two: 0x12c >> 2 = 0x4b. And that's the last field: the 24-bit immediate 0x00004b in bits 23:0.
Turning that binary instruction back into hex gives you the instruction encoding you were looking for: 0xEA00004B.
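If it helps to see the whole calculation in one place, here is a minimal Python sketch of the steps above. The function names encode_arm_branch and decode_arm_branch are made up for illustration and are not part of any ARM tool. It assumes ARM state (32-bit instructions, PC = instruction address + 8) and only handles the plain B/BL encoding.

```python
def encode_arm_branch(instr_addr, target_addr, link=False, cond=0xE):
    """Encode an ARM-state B (or BL) instruction as a 32-bit word.

    Hypothetical helper for illustration. cond defaults to 0xE (AL, always).
    """
    pc = instr_addr + 8                  # PC reads as current instruction + 8
    offset = target_addr - pc            # PC-relative byte offset
    if offset % 4 != 0:
        raise ValueError("branch offset must be a multiple of 4")
    imm24 = offset >> 2                  # stored in words, not bytes
    if not -(1 << 23) <= imm24 < (1 << 23):
        raise ValueError("branch target out of range for a 24-bit offset")
    imm24 &= 0xFFFFFF                    # two's complement in 24 bits
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | imm24


def decode_arm_branch(word, instr_addr):
    """Split a B/BL word back into (cond, link, target address)."""
    if (word >> 25) & 0b111 != 0b101:
        raise ValueError("not a B/BL encoding")
    cond = (word >> 28) & 0xF
    link = bool((word >> 24) & 1)
    imm24 = word & 0xFFFFFF
    if imm24 & 0x800000:                 # sign-extend the 24-bit field
        imm24 -= 1 << 24
    target = instr_addr + 8 + (imm24 << 2)
    return cond, link, target


word = encode_arm_branch(0x2A44, 0x2B78)
print(f"{word:08X}")                     # EA00004B

cond, link, target = decode_arm_branch(word, 0x2A44)
print(f"cond={cond:X} link={link} target={target:#x}")
```

Running the decoder on the result gives back the original target, which is a quick way to check an encoding you worked out by hand.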