Wednesday, November 5, 2014

Assembly Tutorial 1: Part 2 Basic Concepts

The code for this tutorial can be found at https://github.com/musicman89/Assembly-Tutorial

Part 1: Intro
Part 2: Basic Concepts
Part 3: Writing the Code
Part 4: Using QEMU and NASM

Basic Concepts
In this tutorial we will cover the basic building blocks of assembly programming: assembler directives, registers, opcodes, labels, and memory allocation.

Directives
We will start with assembler directives.  These tell the assembler how to assemble the code.  We will only cover three of these in this tutorial.

  • BITS: Tells the assembler which processor mode to generate code for.  We will use BITS 16 because we are writing code for 16-bit execution.
  • ORG: Tells the assembler where this program is going to be located in memory.  In our first program we will declare ORG 0x7c00, which is the address at which the BIOS loads the boot sector.
  • CPU: Tells the assembler what processor we are assembling the code for.  An example would be CPU 8086 if we wanted to make sure that our code only uses instructions available on that specific processor.
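As a rough sketch, the three directives could sit together at the top of a source file like this.  It assumes NASM syntax and a flat binary output format (the kind you get from nasm -f bin), and the single jmp instruction is only a placeholder so that the file contains some code.

```nasm
BITS 16          ; generate code for 16-bit execution
CPU 8086         ; reject any instruction that a plain 8086 does not support
ORG 0x7c00       ; calculate addresses as if the code starts at 0x7C00

    jmp $        ; placeholder: jump to this same instruction forever
```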

Registers
Next we have the processor registers.  A register is, in simple terms, a small piece of storage built into the processor, basically a variable.  For this tutorial I will stick to the four general-purpose 16-bit registers:

  • AX: the Accumulator Register, commonly used for arithmetic
  • BX: the Base Register, often used to store memory addresses
  • CX: the Count Register, commonly used to store count values for loops
  • DX: the Data Register, which holds the upper 16 bits of the result of a 16-bit multiply and the remainder of a 16-bit divide

Each of these registers can be accessed as two 8-bit registers by replacing the X with an H (High) or L (Low).  For example, AX can be split into AH and AL.

Also note that no matter how you access the register, it is the same storage.  If we set the AX register to 0xFFFF and then set the AL register to 0xAA, the value of AX will now be 0xFFAA.
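The example above can be written out as a short sketch in NASM syntax.  The last instruction is an extra step, not from the text, added to show the same effect on the high byte.

```nasm
BITS 16

    mov ax, 0xFFFF   ; AX = 0xFFFF, so AH = 0xFF and AL = 0xFF
    mov al, 0xAA     ; overwrite only the low byte
                     ; AX is now 0xFFAA (AH = 0xFF, AL = 0xAA)
    mov ah, 0x12     ; overwrite only the high byte
                     ; AX is now 0x12AA (AH = 0x12, AL = 0xAA)
```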

Opcodes
Now we get to the meat of assembly development: the opcodes.  In assembly language each one is written as a mnemonic, a short name that represents a single processor instruction.  We will only cover a few basic ones in this tutorial, and many more will be added in future tutorials.  At this point I will not explain how to use them, just what they do.

  • MOV: copies data from one location to another
  • INT: initiates a software interrupt; we use this to call services provided by the BIOS
  • JMP: moves the current point of execution to another place in the program
  • CALL: saves the address of the next instruction on the stack and goes to another place in the program
  • RET: sends us back to the instruction right after the CALL
  • HLT: halts the processor until the next interrupt arrives
  • CLI: clears the interrupt flag, so the processor ignores maskable hardware interrupts
  • ADD: performs addition
  • SUB: performs subtraction
  • MUL: performs unsigned multiplication
  • DIV: performs unsigned division
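As a small preview, and only as a sketch, here is how the arithmetic opcodes interact with the registers described earlier.  The values are arbitrary and were picked so that the multiply overflows 16 bits and the divide leaves a remainder; proper usage is covered in Part 3.

```nasm
BITS 16

    mov ax, 1000     ; AX = 1000
    add ax, 24       ; AX = 1024
    sub ax, 24       ; AX = 1000

    mov bx, 100
    mul bx           ; DX:AX = AX * BX = 100000 = 0x000186A0
                     ; DX = 0x0001 (upper 16 bits), AX = 0x86A0 (lower 16 bits)

    mov cx, 7
    div cx           ; divides the 32-bit value in DX:AX (100000) by CX
                     ; AX = 14285 (quotient), DX = 5 (remainder)
```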


Memory Allocation and Labels
At this point we are almost ready to write a program.  For this tutorial we will still need memory allocation instructions as well as labels.

For static memory allocation we will be using the instructions DB (define byte) and DW (define word).  These tell the assembler that what follows is data and not an instruction, and the assembler places those bytes directly into the assembled program.
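A minimal sketch of DB and DW in NASM syntax follows.  The names letter, number, and message are made up for illustration, and reading the data back this way assumes the DS segment register is 0, which a real boot sector would have to set up itself.

```nasm
BITS 16
ORG 0x7c00

    mov al, [letter]     ; load the byte stored at 'letter' into AL (assumes DS = 0)
    mov bx, [number]     ; load the word stored at 'number' into BX (assumes DS = 0)
    jmp $                ; loop forever so execution never runs into the data

letter:  db 'A'          ; one byte: 0x41
number:  dw 0x1234       ; one word (two bytes), stored low byte first: 0x34, 0x12
message: db 'Hello', 0   ; six bytes: five characters and a terminating zero
```

The data sits after the jmp on purpose: the processor cannot tell data from instructions, so anything it reaches will be executed.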

For dynamic memory allocation we use the stack, which at this point we will describe as a first-in, last-out pool of data.  We use the PUSH instruction to put data onto the stack and the POP instruction to retrieve it.  Think of it like a discard pile of playing cards: the last card you put on the pile is the first one you pick up when you take one from it.
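The discard pile behaviour can be sketched like this.  It assumes the stack has already been set up somewhere usable, which is not shown here.

```nasm
BITS 16

    mov ax, 1
    mov bx, 2
    push ax          ; stack, top first: 1
    push bx          ; stack, top first: 2, 1
    pop cx           ; CX = 2, the last value pushed comes off first
    pop dx           ; DX = 1, and the stack is back to where it started
```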

As for labels, these are simply addresses that we name in the program so we can refer to them later.  The assembler will swap them out with the actual address.  A label is simply a name followed by a colon.  We can also have sub-labels (NASM calls them local labels), which belong to the most recent regular label.  These are prefixed with a '.'
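Here is a rough sketch of labels and a sub-label in NASM syntax, using only opcodes from the list above.  The names start, double_ax, and halt are made up for illustration, and the CALL assumes a working stack.

```nasm
BITS 16
ORG 0x7c00

start:
    mov ax, 2
    call double_ax       ; saves the return address, then jumps to double_ax
    jmp halt             ; execution resumes here after RET, with AX = 4

double_ax:
    add ax, ax           ; AX = AX + AX
    ret                  ; go back to the instruction after the CALL

halt:
    cli                  ; ignore maskable hardware interrupts
.loop:                   ; sub-label: its full name is halt.loop
    hlt                  ; stop until something wakes the processor
    jmp .loop            ; if it does wake up, halt again
```

Because .loop belongs to halt, another routine could define its own .loop without the two names clashing.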
