What are some good courses online to learn assembly code? I have seen a few courses, but they all teach using .asm files. I am trying to learn assembly where the files have an extension of .s.
Also, what are the main differences between these two types of assembly files? Is one better than the other?
I've also been hearing a lot about "ARM", "x86" and "x86-64". What exactly are they? The one I would like to learn is "ARM".
Thanks in advance!
Assembly language is specific to the assembler (the tool), not the target (the processor). There are many incompatible x86 assemblers and assembly languages, as well as many ARM assemblers and assembly languages.
x86 refers to a specific instruction set from Intel, starting with the 8088/8086 and continuing through their derivatives over the decades, with instructions like mov al,10h. The instruction set has been expanded upon over time and is backward compatible. AMD, and a very small list of other companies in the past, make x86-compatible/licensed processors, meaning they run the same instructions but the chips are not made by Intel.
ARM describes a short list of instruction sets that run on ARM processors, starting with Acorn, which made chips, through the post-Acorn company, which only makes IP, not chips. For every x86 processor/core you have a few to several to many ARM-based items in your life (as well as other processors). ARM is not a bad first instruction set, and the GNU assembler and GNU tools (binutils) are a good place to start.
The file name is not relevant. Some tools have a default, some don't care (for example, if you feed .s or .S to GNU gcc it knows that this is not a C file). The filename extension in general does not have meaning. In general you feed the whole filename to the tool, so in cases where, for example, you are using the wrong tool (gcc), the extension might help the tool figure things out, but for a real assembler it would be up to the assembler itself if it cared. The extension does not indicate x86 vs ARM vs MIPS, etc.
ARM has multiple instruction sets. There is ARM from architectures v1 through v3 (the Acorn days), then ARMv4 through ARMv7, which are an extension of that. Going from v3 to v4 removed some things; otherwise it has been additions. ARMv4 through ARMv7 also have various versions of Thumb instructions: where ARM instructions are 32 bits, Thumb instructions are 16-bit based. Then there is Jazelle and multiple floating point instruction sets. And then there is the 64-bit instruction set, which is a whole other thing. It is probably best to start with Thumb on an emulator or microcontroller, then move up to ARM. You can bang out an instruction set simulator for Thumb in a weekend and learn the instruction set better than most people who use it regularly.
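To give a feel for how small the core of such a simulator can be, here is a minimal sketch in Python. It is only an illustration under heavy assumptions: it handles just three Thumb encodings (MOVS, ADDS and SUBS with an 8-bit immediate), ignores the condition flags those instructions normally set, and has no memory, program counter or branches. The names `run_thumb` and `demo_regs` are made up for this example.

```python
# Minimal Thumb simulator sketch: only three 16-bit encodings are handled.
#   MOVS Rd, #imm8  ->  00100 Rd(3 bits) imm8
#   ADDS Rd, #imm8  ->  00110 Rd(3 bits) imm8
#   SUBS Rd, #imm8  ->  00111 Rd(3 bits) imm8
# Assumption: condition flags are ignored to keep the sketch short.

def run_thumb(program):
    """Execute a list of 16-bit Thumb halfwords and return registers r0-r7."""
    regs = [0] * 8  # the low registers, which are all these encodings can address
    for insn in program:
        opcode = insn >> 11       # top five bits select the operation
        rd = (insn >> 8) & 0x7    # destination register number
        imm8 = insn & 0xFF        # 8-bit immediate value
        if opcode == 0b00100:     # MOVS
            regs[rd] = imm8
        elif opcode == 0b00110:   # ADDS
            regs[rd] = (regs[rd] + imm8) & 0xFFFFFFFF
        elif opcode == 0b00111:   # SUBS
            regs[rd] = (regs[rd] - imm8) & 0xFFFFFFFF
        else:
            raise ValueError(f"unhandled instruction 0x{insn:04X}")
    return regs

# movs r0,#0x99 ; adds r0,#1 ; movs r1,#5 ; subs r1,#6
demo_regs = run_thumb([0x2099, 0x3001, 0x2105, 0x3906])
print([hex(r) for r in demo_regs[:2]])
```

Growing this into something useful mostly means adding more decode cases, a program counter, memory and the flags, checking each one against the ARM documentation as you go.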
An instruction set is just that: a set of instructions. One model/brand of coffee maker may have a toggle switch to turn the thing on. Another has a push button. Another you just plug in or unplug. They all perform ON/OFF instructions on that coffee maker, but the specific details of how to perform that operation vary.
Instructions are bits. Processors are incredibly dumb; the smarts are all in the programmer who lays out a list of instructions in order. The processor just does what it is told, even if told to crash and burn. So while x86, ARM, MIPS, etc. all have registers, and a register can hold the value 123, the names of the registers, the instruction to put 123 into a particular register, and the specific bit pattern that tells the processor to perform that operation all vary from instruction set to instruction set. Beyond that, instead of having to type in binary or hex to write programs, humans prefer to use something human readable.
Instead of writing programs like this
we can use this instead:
But due to the histories and preferences of assembler (the tool) authors, the same instruction, the SAME MACHINE CODE, can be created from different assembly languages. It is not uncommon to find these styles (not that someone has actually made tools for all of these for this instruction set)
So:
These are various assembly languages and instruction sets. They all perform the same basic function: put the value 0x99 in one of the processor's registers. As you can see, though, the machine code (the bits) and the syntax vary by assembler and target. A specific processor may only support one or a few instruction sets. You have to match the machine code to the processor, and configure the processor for that machine code if the processor supports more than one instruction set.
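As a rough sketch of that point, the Python below builds the machine code for "put 0x99 in a register" for three targets. The function names are hypothetical, the register is hard-coded (AL for x86, r0 for ARM and Thumb), and the ARM and Thumb bytes assume little-endian byte order. The dictionary keys show assembly syntaxes that produce each encoding; for x86, both the Intel-style and the AT&T-style line give the same two bytes.

```python
def x86_mov_al(imm8):
    """x86: MOV AL, imm8 is the opcode byte 0xB0 followed by the immediate."""
    return bytes([0xB0, imm8 & 0xFF])

def arm_mov_r0(imm8):
    """ARM (32-bit): MOV r0, #imm8, always-execute condition, no rotation."""
    word = 0xE3A00000 | (imm8 & 0xFF)
    return word.to_bytes(4, "little")  # assumption: little-endian memory order

def thumb_movs_r0(imm8):
    """Thumb: MOVS r0, #imm8 is a single 16-bit halfword, 0x2000 | imm8."""
    halfword = 0x2000 | (imm8 & 0xFF)
    return halfword.to_bytes(2, "little")  # assumption: little-endian memory order

encodings = {
    "x86:   mov al,0x99  or  movb $0x99,%al": x86_mov_al(0x99),
    "ARM:   mov r0,#0x99": arm_mov_r0(0x99),
    "Thumb: movs r0,#0x99": thumb_movs_r0(0x99),
}

for syntax, code in encodings.items():
    # Print each syntax next to the bytes as they would sit in memory.
    print(f"{syntax:42} {code.hex()}")
```

Same job, three different bit patterns and three different lengths, which is why machine code built for one instruction set is meaningless to a processor expecting another.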
Each instruction set has instructions and ideally has one or more assemblers, which means one or more assembly languages. Once you learn one instruction set, the next and the next are much easier. Some instruction sets are not the best to start with: newer ARM and x86 and others have a significant amount of logic related to protection and things other than the instruction set itself that can impede learning the basics. So older or simpler instruction sets are better as a first one. Some also have more rules and are less orthogonal than others. The pdp11 and msp430 are very good first instruction sets; there are simulators, and you could bang out your own in an afternoon.
You can buy an msp430. While pdp11s are available, getting one running is not necessarily trivial or cheap. Thumb is a good start: there are emulators and inexpensive hardware, the jump from Thumb to ARM is fairly simple, and the documentation is good. Many will argue that their favorite, or the only one they know, is the best first one. Until you try you will not know, but I suggest you do not stop at one. The GNU tools (binutils and gcc) support many different instruction sets, and the tools are debugged, so you can compare the tools' output to the documentation for that processor and figure out what is going on.