Compilation Stages

This week I’m going to write about the compiler, which turns our source code into an executable file. I believe most people have used an IDE (Integrated Development Environment) to compile and run their code, especially in the early stages of learning how to program. No doubt an IDE is an easy-to-use tool, and it can seem to do magic tricks: we just type in the source code, compile it, and then run it. Voila, the output that we expect appears in a split second (given that the code is error-free…).

However, the code actually goes through a few stages when we compile it. If I’m not mistaken, an IDE takes care of all those intermediate steps for us. I remember reading recently that relying on an IDE keeps programmers from learning thoroughly what happens to their code while it is being compiled.

In practice, most compilers perform all three steps (compiling, assembling, and linking) in a single invocation. Finally, the loader loads the program into memory and starts execution.


Compiler (I/O: High-level Code / Assembly Code)

A compiler translates high-level code, which most people write nowadays, into assembly language. High-level code is written in languages such as C and C++, while assembly language consists of instructions such as MOV, ADD, and SUB. A high-level-language program needs significantly fewer lines of code than the equivalent assembly language program, so programmer productivity is much higher.
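To make the idea a bit more concrete, here is a tiny sketch of the translation step in Python. It is not how a real compiler works internally; it only handles left-to-right chains of + and - and targets a made-up one-register machine. The function name compile_expr, the register R0, and the exact instruction syntax are all my own assumptions for illustration.

```python
import ast


def compile_expr(source):
    """Translate an expression such as 'a + b - 2' into toy assembly lines.

    Assumption: a hypothetical machine with a single register R0.
    """
    ops = {ast.Add: "ADD", ast.Sub: "SUB"}

    def operand(node):
        # Only plain variable names and integer literals are supported.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return str(node.value)
        raise ValueError("unsupported operand")

    def emit(node, out):
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            # Compute the left side into R0 first, then apply the operator.
            emit(node.left, out)
            out.append(f"{ops[type(node.op)]} R0, {operand(node.right)}")
        else:
            out.append(f"MOV R0, {operand(node)}")
        return out

    tree = ast.parse(source, mode="eval")
    return emit(tree.body, [])


for line in compile_expr("a + b - 2"):
    print(line)
```

One short line of high-level code becomes three toy instructions here (a MOV, an ADD, and a SUB), which hints at why writing everything in assembly takes so many more lines.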

Assembler (I/O: Assembly Code / Object File)

The assembler turns the assembly language code into an object file containing machine language code. Machine language is made up of 1s and 0s; it is hard for humans to read and is meant for the computer, which only works in 1s and 0s. The assembler makes two passes through the assembly code. On the first pass, it assigns instruction addresses and finds all the symbols, such as labels and global variable names. On the second pass through the code, it produces the machine language code. Addresses for the global variables and labels are taken from the symbol table. The object file is a combination of machine language instructions, data, and information needed to place instructions properly in memory.
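The two passes described above can be sketched in a few lines of Python. This is a toy under my own assumptions, not a real assembler: every instruction takes exactly one operand and occupies one address, a label is a line ending in a colon, and the opcode numbers in OPCODES are invented.

```python
OPCODES = {"MOV": 1, "ADD": 2, "SUB": 3, "JMP": 4}  # hypothetical encoding


def assemble(lines):
    """Return (symbol_table, machine_code) for a toy assembly program."""
    # Pass 1: assign an address to each instruction and record every label.
    symbols = {}
    instructions = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            # A label names the address of the next instruction.
            symbols[line[:-1]] = len(instructions)
        else:
            instructions.append(line)

    # Pass 2: emit one (opcode, operand) pair per instruction,
    # looking label operands up in the symbol table.
    code = []
    for text in instructions:
        mnemonic, operand = text.split()
        value = symbols[operand] if operand in symbols else int(operand)
        code.append((OPCODES[mnemonic], value))
    return symbols, code


program = [
    "start:",
    "MOV 5",
    "loop:",
    "SUB 1",
    "JMP loop",
    "JMP end",   # forward reference: 'end' is defined further down
    "end:",
    "ADD 0",
]
symbols, code = assemble(program)
print(symbols)
print(code)
```

The "JMP end" line is the reason for the two passes: when the assembler first sees it, the label end has not been defined yet, so its address can only be filled in after the whole file has been scanned once.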

Linker (I/O: Object File / Executable File)

The linker combines all of the object files into one machine language file called the executable. The linker uses the relocation information and symbol table in each object module to resolve all undefined labels.
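Here is a rough Python sketch of that resolving step. The object module layout is something I made up for illustration: each module is a dictionary with "code" (a list of (opcode, operand) pairs in an invented encoding), "symbols" (the labels it defines, as offsets within the module), and "relocations" (which instructions refer to a label that still has to be filled in). Real object file formats are far more involved.

```python
def link(modules):
    """Combine toy object modules into one list of (opcode, operand) pairs."""
    # Step 1: place the modules one after another and build a global
    # symbol table that maps each label to its final address.
    global_symbols = {}
    bases = []
    next_address = 0
    for module in modules:
        bases.append(next_address)
        for name, offset in module["symbols"].items():
            if name in global_symbols:
                raise ValueError(f"duplicate symbol: {name}")
            global_symbols[name] = next_address + offset
        next_address += len(module["code"])

    # Step 2: copy each module's code and patch every relocation entry
    # with the final address of the label it refers to.
    executable = []
    for module in modules:
        code = [list(instr) for instr in module["code"]]
        for offset, name in module["relocations"]:
            if name not in global_symbols:
                raise ValueError(f"undefined symbol: {name}")
            code[offset][1] = global_symbols[name]
        executable.extend(tuple(instr) for instr in code)
    return executable


# Two hypothetical modules that jump to each other (opcode 4 stands for a jump).
main_obj = {
    "code": [(1, 5), (4, 0)],          # operand 0 is a placeholder
    "symbols": {"main": 0},
    "relocations": [(1, "helper")],    # instruction 1 needs helper's address
}
helper_obj = {
    "code": [(2, 1), (4, 0)],
    "symbols": {"helper": 0},
    "relocations": [(1, "main")],
}
print(link([main_obj, helper_obj]))
```

In this sketch helper ends up at address 2 because main_obj occupies addresses 0 and 1, so the placeholder in main_obj's jump is patched to 2. A label that no module defines raises an error, which is roughly what an "undefined reference" message from a real linker is telling you.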


Now that the executable file is on disk, the operating system loads it into memory and starts running it.
