Ok, I’m in the mood to do some technical blogging. I was reading an article recently about the fact that many universities are teaching computer science using languages such as Java that don’t use pointers. I’m starting to feel like one of those old guys who says things like “back in my day...” Here’s the article: The Perils of JavaSchools.
First, a bit of history about me. I spent 6 years of my youth in the Navy as an electronics technician. During that time I bought a computer that was just small enough to fit in the tech cabinet in the transmitter room (we kept tools and parts in that cabinet, and I stored my computer on the bottom shelf). The computer I bought was an Apple Macintosh. Oh yeah, “THE” Macintosh, with a grand total of 128k of memory. This machine was just enough to whet my appetite for programming. Of course, I only knew two languages at that time: Basic (with line numbers) and 8080 assembly (from a computer board I built by wire-wrapping; I’ll do a post on that later). Anyway, Basic was very hard to use with line numbers, and Apple upgraded the Mac to a 512k version within months of my purchase. I upgraded my Mac, ran to the Apple store and looked for a new language. I saw a box of Mac Pascal and the syntax looked very much like Basic, so I bought it. Oooh, it was really cool and shiny. Yup, I taught myself Pascal just so I could program games. I had great ambitions back then.
I was writing code for a game I had in mind when I discovered that Mac Pascal could only access 32k of global memory. That was a bummer because I couldn’t store my game board and all the playing pieces in memory. So I bought some books (yeah, no Google back then, this was 1985 or so) and learned dynamic memory techniques. This was kind of a no-brainer for me because I already knew how to program in assembly language for an 8080 processor, so I knew all about memory addresses. The 8080 lacked a lot of features for sophisticated memory addressing, but it did use an H & L register combination to reference memory.
Anyway, back to pointers…
Uber-Simple Pointers
All right, a pointer is basically a memory address. That’s it. Nothing more. The trick is how to work with pointers. You could just put in a memory address and access the memory, but more than likely, the computer will crash or give you a memory fault. That’s because the computer uses memory for the operating system and memory to run other things, before your program has a chance to start running. Only the operating system knows what memory is in use at any given time. You have to ask the OS for memory, then the OS will give you an address to the location that you can use.
Here’s an example of a computer with 10 bytes of memory that is not in use:
Each number represents the address of that memory cell. To keep this simple, I’m going to ignore the amount of memory allocated at a time and assume that the OS only gives you one cell at a time. This would not be very useful in reality, but it’s just an example.
So now your program starts, and you are about to ask for some memory from the OS. This is what the memory starts out as:
The gray cells represent the memory cells in use by the OS and your program. You don’t really care about what memory is in use, because you just want a pointer to one memory cell that you can use. You use the “new” operator in C++ or call the “malloc” function in C and you get a pointer to a cell of memory. A typical program in C would look like this:
#include <stdio.h>
#include <stdlib.h>

int main()
{
    char * mypointer;

    mypointer = (char *)malloc(1);
}
Now the variable “mypointer” is not a character but an address of, or pointer to, a character. The OS will look up a blank memory cell, mark it as “in use” and then give your program the address (returned by the malloc() function), which is stored in “mypointer”. So now your memory would look like this:
And your variable “mypointer” would contain the address 5. If you were to print the address by converting the pointer into an integer, you would see the actual address number. However, you don’t really care what the address number is because “mypointer” will keep that for you and now you can read and write to that address location. This process is called “allocating memory.” Now I’m going to add a line of code to put a character into address 5:
#include <stdio.h>
#include <stdlib.h>

int main()
{
    char * mypointer;

    mypointer = (char *)malloc(1);
    *mypointer='a';
}
Notice the “*” before the variable “mypointer”? The star means “the location this pointer points to”: we are putting the character ‘a’ into the memory location whose address “mypointer” contains. If you left the star off, you would be trying to replace the address stored in “mypointer” with the ASCII value of ‘a’ (the compiler should complain that you are assigning an integer to a pointer). Now our memory looks like this:
All right. I’m almost to the end of the “uber-simple pointer” example. The final thing you must remember to do is give the memory back to the operating system when you’re finished with it. Remember, you’re just borrowing the memory. If you don’t give it back, it can’t be used by any other program running on your computer. If you keep allocating and never give it back, you’ll create what is called a “memory leak”: the computer gradually runs out of memory because your program keeps holding on to memory it no longer uses.
To “de-allocate” memory, you can call the “free” function:
#include <stdio.h>
#include <stdlib.h>

int main()
{
    char * mypointer;

    mypointer = (char *)malloc(1);
    *mypointer='a';
    free(mypointer);
}
One last thing. When you declare a new pointer variable, set it to “null.” Null is a special memory address (usually zero) that is universally accepted as a pointer that is pointing to nothing. When you reference a pointer in a function and you’re not sure if it has been allocated yet, you can check to see if it is null first. Here’s an example:
#include <stdio.h>
#include <stdlib.h>

int main()
{
    char * mypointer = NULL;

    if (!mypointer)
        mypointer = (char *)malloc(1);

    /* malloc returns NULL on failure, so check before writing */
    if (mypointer)
        *mypointer='a';

    if (mypointer)
    {
        free(mypointer);
        mypointer=NULL;
    }
}
Notice how C treats a null pointer as false in a condition. This example is so tiny that it’s difficult to see the use in performing such extra work, but a large program might have variables declared in a location that is nowhere near where the pointer is allocated. Using this technique will prevent you from referencing a memory location that is not allocated. Such bugs are very difficult to find, but a null pointer error is obvious. Also, notice how I re-assigned null to “mypointer” when I freed it. If this pointer is referenced in code after being freed, the null value will, on most systems, cause your program to bomb right away and tell you that you made a mistake and used a freed pointer.
Conclusion
Pointers can get tricky, but if you follow the basic rules that I laid out above, you should be able to work through dynamic structures with ease. In the future I’ll go over some very basic dynamic structures like linked lists and try to make it as simple and visual as possible. But for now, good luck with your programming.