My objective is to do a comparative study of a few instruction set architectures.
For each instruction set architecture, how can I find the most commonly used instructions?
These are the steps I am thinking of:
- Find common ISAs for a chosen domain
- Find popular programs for each such ISA
- Disassemble the program instructions (.code) (which tool?)
- Collect statistics on instruction format, opcode, type. (which tool?)
Here is a very good study on x86 machine code statistics: https://www.strchr.com/x86_machine_code_statistics
I have tried the command below for disassembling, but it does not seem to disassemble properly. The disassembled output shows some das
instructions, which should not be present in real code.
ndisasm -b32 -a $(which which)
You can try this to gather the mnemonics from the .text section:
After that, you can either keep only the mnemonic names by appending
awk '{print $1}'
to the previous command, or transform the data in some other way. After all of this, append
sort | uniq -c
to the previous steps. So my resulting command looked like:
It prints the frequency of every mnemonic in the program's text section.
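As a minimal sketch of such a pipeline, here is one way to put the pieces together. It assumes GNU objdump as the disassembler and its usual tab-separated output (address, raw bytes, instruction); that choice, the helper name count_mnemonics, and the /bin/ls path are illustrative assumptions, not necessarily the command used above. Instruction prefixes such as lock or rep are counted as if they were mnemonics.

```shell
# Count instruction mnemonics in `objdump -d` output read from stdin.
# Assumption: GNU objdump layout, i.e. "address:<TAB>raw bytes<TAB>instruction".
count_mnemonics() {
    awk -F'\t' '
        # Instruction lines have three tab-separated fields. Lines that only
        # continue the raw bytes of a long instruction have fewer and are skipped.
        NF >= 3 {
            n = split($3, parts, " ")   # first word of the instruction field
            if (n > 0) print parts[1]
        }
    ' | sort | uniq -c | sort -rn       # frequency table, most common first
}

# Example usage (the binary path is only an illustration):
# objdump -d -j .text /bin/ls | count_mnemonics
```

Unlike ndisasm, objdump parses the executable's headers, so -j .text limits the disassembly to the code section instead of decoding the whole file as instructions.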