Learning x86 with NASM — Working with Characters, Lists and Strings
In this article, we will create a program that works with characters, strings, and lists using the .data section.
Declaring Characters in NASM x86
In x86, we can declare data of different sizes in the .data section. To declare a character, we need to consider how much space one character of text occupies. An ASCII character is 8 bits in size, so we declare it with the DB (define byte) directive. If we want to use a fixed-width Unicode encoding such as UTF-32, we would use DD (define doubleword) to allocate 32 bits per character. For the value, we place a single character between single quotes, such as 'A'.
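A minimal sketch of such declarations is given here. The name char is the one used in this article; wchar is a hypothetical name added only to illustrate the 32-bit case.

```nasm
section .data
    char    DB 'A'      ; 8-bit variable holding the ASCII code for A (65)
    wchar   DD 'A'      ; hypothetical 32-bit variable holding the same code point
```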
The declaration char DB 'A' defines an 8-bit variable named char with a value of A. Since the value is 8 bits wide, we can treat it as an ASCII character. Note that when you read the value of char, you will see its numerical ASCII encoding rather than the character A. Here, that means the variable holds the value 65 rather than the letter A. A full list of ASCII value mappings can be found in an ASCII chart like: http://sticksandstones.kstrom.com/appen.html
To move this data into a register, we need to make sure the register size matches the size of the data. For our 8-bit character, this means that the register we place the data in must also be 8 bits in size. To accomplish this, we use one of the 8-bit sub-registers: the low byte of a general-purpose register (such as bl) or the second byte, bits 8 to 15 (such as bh). For example, to move the data into the EBX register, we would use bl.
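The load can be sketched as a small complete program. This is an illustration under assumptions that are not part of the article: a 32-bit Linux target, the int 0x80 system-call interface, and the _start entry point. The program exits with the loaded value as its status so the result can be observed.

```nasm
section .data
    char    DB 'A'          ; the 8-bit character declared earlier

section .text
global _start
_start:
    xor     ebx, ebx        ; clear all 32 bits of ebx first
    mov     bl, [char]      ; load the byte stored at char into bl (bl = 65)
    mov     eax, 1          ; assumed: sys_exit on 32-bit Linux
    int     0x80            ; exit status is ebx, that is 65
```

Assuming the file is saved as char.asm (a hypothetical name), it could be built with nasm -f elf32 char.asm followed by ld -m elf_i386 char.o -o char, and running ./char and then echo $? in a shell should report 65.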
After the instruction mov bl, [char] executes, the bl register stores the value 65, which corresponds to the character A in ASCII encoding.
Lists and Strings
A string is just a list of characters, so before we approach strings, let's discuss lists. A list can be declared in the .data section by following the directive with comma-separated values, for example list DB 1, 2, 3, 4.
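To access the first element of the list, you can use [list]. The label list is the address of the first element, and the square brackets read the value stored at that address. To access the second element, you add 1 to that address, giving you [list+1]. The next element is at [list+2], and this continues until you reach the end of the list. These offsets are counted in bytes, so stepping by 1 works here because each element is one byte.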
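The element accesses described above can be sketched as follows. The list values 1, 2, 3, 4 are assumed for illustration, and the program again assumes a 32-bit Linux target with the int 0x80 interface; it exits with the second element as its status.

```nasm
section .data
    list    DB 1, 2, 3, 4   ; four one-byte elements (values assumed)

section .text
global _start
_start:
    xor     ebx, ebx        ; clear ebx so only bl carries a value
    mov     al, [list]      ; first element  -> al = 1
    mov     cl, [list+2]    ; third element  -> cl = 3
    mov     bl, [list+1]    ; second element -> bl = 2
    mov     eax, 1          ; assumed: sys_exit (this overwrites al)
    int     0x80            ; exit status is ebx, that is 2
```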
When you declare a list, each item in the list is given the same amount of memory for storage. With DB, each item in the list is given one byte of memory. For strings, we size each character according to the encoding we wish to use. An ASCII string needs one byte per character, so it is declared with DB just like the list of bytes.
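To make the link between lists and strings concrete, the sketch below declares the same three bytes in three ways. The labels as_text, as_chars, and as_codes are hypothetical names chosen for this illustration.

```nasm
section .data
    as_text     DB "Hi", 0          ; quoted string followed by a terminating 0
    as_chars    DB 'H', 'i', 0      ; the same bytes, one character at a time
    as_codes    DB 72, 105, 0       ; the same bytes as ASCII codes
```

All three labels point at identical data, which is why a string can be indexed exactly like a list of bytes.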
We can interact with the string in the same way as the list. The only thing that changes is the definition. We place the characters of our string in quotation marks and follow them with a comma and a 0. The 0 marks the end of the string and is often referred to as a null character or null terminator. NASM does not add it automatically, so it must be written explicitly.
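As a sketch of how the terminating 0 is used, the program below walks a string one byte at a time until it reaches the null character, counting the characters along the way. The label message, its contents, and the 32-bit Linux exit sequence are assumptions made for this example.

```nasm
section .data
    message DB "Hello", 0           ; hypothetical null-terminated ASCII string

section .text
global _start
_start:
    xor     ebx, ebx                ; ebx = index into the string, starting at 0
next_char:
    cmp     byte [message+ebx], 0   ; is the current byte the null character?
    je      done                    ; yes: ebx now holds the string length
    inc     ebx                     ; no: move on to the next character
    jmp     next_char
done:
    mov     eax, 1                  ; assumed: sys_exit on 32-bit Linux
    int     0x80                    ; exit status is the length, that is 5
```

The index in ebx plays the same role as the +1 and +2 offsets used with the list; the only difference is that the loop stops when it finds the 0 instead of relying on a known element count.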
You have now successfully learned how to store and reference characters, lists and strings in NASM x86!