Implementing Decision Structures and Loops in x86 Assembly
In x86, a compare followed by a conditional jump lets us implement logic similar to an if statement. The pattern has two parts: the compare instruction and the jump instruction. The compare instruction takes two operands (registers, immediate values, or memory locations) and compares them. The jump instruction then tells the processor to move to a specific location if the condition we are checking is met.
When we do a comparison in assembly, the processor compares the two operands using subtraction. In the AT&T syntax used here, it subtracts the first operand from the second, discards the result, and sets specific flags based on it. These flags are stored in the %eflags register, and the jump instruction reads them to determine the outcome of the comparison and act accordingly.
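To make the flag-setting concrete, here is a minimal Python sketch of what a 32-bit compare computes. It assumes the AT&T operand order (the first operand is subtracted from the second); the names cmp_flags and MASK32 are illustrative and not part of any real API.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def sign_bit(value):
    """Return the top (sign) bit of a 32-bit value."""
    return (value >> 31) & 1


def cmp_flags(first, second):
    """Model a 32-bit compare: compute second - first and report the flags."""
    a = second & MASK32        # operand being subtracted from
    b = first & MASK32         # operand being subtracted
    result = (a - b) & MASK32  # the result itself is thrown away
    return {
        "ZF": int(result == 0),  # zero flag: the operands were equal
        "SF": sign_bit(result),  # sign flag: top bit of the result
        "CF": int(a < b),        # carry flag: an unsigned borrow occurred
        # overflow flag: the signed result does not fit in 32 bits
        "OF": int(sign_bit(a) != sign_bit(b) and sign_bit(result) != sign_bit(a)),
    }


print(cmp_flags(10, 7))  # flags after computing 7 - 10
print(cmp_flags(5, 5))   # equal operands set ZF
```

The conditional jumps never see the subtraction result, only these flag bits.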
The jump instruction is used to jump to a specified label if the condition it is checking is true. The most commonly used jump instructions in x86 are listed below. The comparisons treat the values as signed numbers:
1. je: Jumps to the label specified if the values compared were equal
2. jg: Jumps to the label specified if the second value was greater than the first value
3. jge: Jumps to the label specified if the second value was greater than or equal to the first value
4. jl: Jumps to the label specified if the second value was less than the first value
5. jle: Jumps to the label specified if the second value was less than or equal to the first value
6. jmp: Jumps to the specified label no matter what
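The list above can be sketched as a lookup from flags to a yes/no decision. This is a minimal illustration, assuming the standard signed-comparison flag conditions; the function name jump_taken and the dictionary of flags are hypothetical.

```python
def jump_taken(mnemonic, flags):
    """Decide whether a jump is taken, given the flags left by a compare."""
    zf, sf, of = flags["ZF"], flags["SF"], flags["OF"]
    conditions = {
        "je": zf == 1,                # equal
        "jg": zf == 0 and sf == of,   # second > first (signed)
        "jge": sf == of,              # second >= first (signed)
        "jl": sf != of,               # second < first (signed)
        "jle": zf == 1 or sf != of,   # second <= first (signed)
        "jmp": True,                  # unconditional
    }
    return conditions[mnemonic]


# Flags left after comparing 10 (first) with 7 (second): 7 - 10 is negative.
flags = {"ZF": 0, "SF": 1, "OF": 0}
for name in ("je", "jg", "jge", "jl", "jle", "jmp"):
    print(name, jump_taken(name, flags))
```

With these flags only jl, jle, and jmp are taken, because the second value (7) is less than the first (10).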
As an example, suppose we want to compare the value in %eax to the value in %ebx, and jump to the label done if %ebx is greater than %eax. To do this, we compare with %eax as the first argument and %ebx as the second argument, then use jg to jump to done.
The first thing our program does is move 10 into register %eax and 7 into %ebx. It then compares %eax to %ebx. If %ebx is greater than %eax, the program jumps to the label done. From there, the program moves 1 into %eax (the system call number for exit) and triggers the system interrupt.
If %ebx is not greater than %eax, we proceed past the jump to the next line, where we move the contents of %eax into %ebx before overwriting %eax for the system interrupt. Either way, %ebx holds the larger value when the program exits, and since the exit system call uses %ebx as the exit status, running echo $? after the assembled program finishes displays the larger of the two numbers.
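The control flow just described can be traced in Python. This is only a sketch of the behavior, not the assembly itself: the function name run is hypothetical, and the instruction names in the comments are a reconstruction from the description above.

```python
def run(eax, ebx):
    """Trace the compare-and-jump program and return its exit status."""
    # compare %eax with %ebx, then jg done: skip the move when ebx > eax
    if not ebx > eax:
        ebx = eax          # move the contents of %eax into %ebx
    # done: %eax is overwritten with 1 (exit) and the interrupt fires
    return ebx & 0xFF      # the shell reports only the low 8 bits of the status


print(run(10, 7))  # prints 10
print(run(7, 10))  # prints 10
```

Whichever register starts with the larger value, the status handed back to the shell is the same.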
We can take this idea one step further and create an application that finds the largest number in a list of numbers stored in memory. This follows very similar logic, with a few added components. First, we need to define a list of numbers in the .data section to work with. Then we need to keep track of our current index in the list and the current maximum. From there, we step through the list, comparing each value to the maximum and updating it as required.
If we were to program this in a high-level language like Python, it would look something like the following.
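A minimal sketch of that logic is shown here. The list contents are assumed for illustration (chosen so the largest value is 222), and the variable names are hypothetical.

```python
# Assumed sample data; any list of non-negative numbers works the same way.
data_items = [3, 67, 34, 222, 45, 75, 54, 34]

index = 0                       # position in the list
maximum = data_items[index]     # the first value is the initial maximum

while index < len(data_items) - 1:
    index += 1                  # advance to the next element
    current = data_items[index]
    if current > maximum:       # found a larger value
        maximum = current

print(maximum)
```

The explicit index and the compare-then-update step are written out on purpose, since each one corresponds to a register or an instruction in the assembly version.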
Now let’s see how this logic is implemented in x86. I’ve added comments to the code to show which x86 instructions correspond with which Python instructions.
This shows that a fair amount of our assembly code maps almost directly to high-level code. The main differences are the ordering and a few extra instructions to handle storage, loops, and comparisons. We start by declaring our list of numbers in the data section. Once we reach the main code, we set up the index and load the first value in the list as the initial maximum. We then loop, comparing the current value to the maximum and updating it as needed, and finally trigger the interrupt to exit the program.
When we declare the list of numbers we are working with, we give it a data type, .long. This is how we know to use 4 as the multiplier in our indexed reference, data_items(,%edi,4). A long in x86 takes 4 bytes of memory, so declaring the type tells us exactly how the data will be stored. The following data types are most commonly used in x86:
1. .byte: Bytes take up one byte of storage and, as unsigned values, are limited to numbers between 0 and 255.
2. .short: Shorts take up two bytes and, as unsigned values, are limited to numbers between 0 and 65535. (With the GNU assembler on x86, .int produces a four-byte value, the same as .long.)
3. .long: Longs take up four bytes, which is the size of a 32-bit register. As unsigned values, they can hold numbers between 0 and 4294967295.
4. .ascii: Each ASCII character takes up one byte of memory and is stored as its numeric character code. The mapping of byte values to ASCII characters is available from many resources online.
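The ranges in the list above follow directly from the storage width: n bytes give 8n bits, so the largest unsigned value is 2^(8n) - 1. A small Python sketch, using the standard struct module to confirm the widths; the helper name unsigned_max is hypothetical.

```python
import struct


def unsigned_max(num_bytes):
    """Largest unsigned value that fits in the given number of bytes."""
    return 2 ** (8 * num_bytes) - 1


# struct format codes for 1-, 2-, and 4-byte unsigned little-endian values
for label, fmt in (("1 byte", "<B"), ("2 bytes", "<H"), ("4 bytes", "<I")):
    size = struct.calcsize(fmt)
    largest = unsigned_max(size)
    struct.pack(fmt, largest)   # packs without error: the value fits exactly
    print(label, "->", largest)

# Each ASCII character is stored as one byte holding its character code.
print(list("Hi!".encode("ascii")))  # prints [72, 105, 33]
```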
One final note is about how we access the data in the data_items list. For this task, we use the indexed addressing mode, specifying the index register and a multiplier of 4. Since each long takes up 4 bytes, the address of an element is the start of data_items plus index*4. This means that offset 0 holds the first number, offset 4 holds the second, offset 8 holds the third, and so on.
When you assemble, link, and run this application, you will see 222 as the result.
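The address arithmetic behind indexed addressing can be sketched in Python by packing a few 4-byte values into a byte string and reading them back at base + index * 4. The sample values and the helper name load_long are assumptions for illustration.

```python
import struct

values = [3, 67, 34, 222]               # assumed sample data
memory = struct.pack("<4I", *values)    # four little-endian 4-byte longs


def load_long(base, index, scale=4):
    """Read the 4-byte value at address base + index * scale."""
    address = base + index * scale
    return struct.unpack_from("<I", memory, address)[0]


for i in range(len(values)):
    print("index", i, "offset", i * 4, "value", load_long(0, i))
```

Index 0 reads from the very start of the list, so the first element sits at offset 0 and each later element is 4 bytes further along.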