The Compiler's Working Process
Source code must be converted into binary machine code to run. This is the task of the compiler.
Take the following source code as an example (assume the file is named test.c).
#include <stdio.h>
int main(void)
{
    fputs("Hello, world!\n", stdout);
    return 0;
}
It must be processed by the compiler before it can run.
$ gcc test.c
$ ./a.out
Hello, world!
For complex projects, the compilation process is usually divided into three steps.
$ ./configure
$ make
$ make install
What exactly are these commands doing? Most books and materials are vague, only saying that this is how you compile, without further explanation.
This article describes how the compiler works, that is, what each of the three commands above does. It draws mainly on Alex Smith's article "Building C Projects". Note that it focuses on the gcc compiler, that is, on C and C++, and may not apply to compilers for other languages.
The First Step: Configuration (configure)
Before starting work, the compiler needs to know the current system environment: where the standard library is, where the software will be installed, which components need to be installed, and so on. System environments differ from one computer to another, so by specifying compilation parameters the compiler can adapt to its environment and produce machine code that runs in each of them. This step of determining the compilation parameters is called "configuration" (configure).
This configuration information is kept in a configuration file, by convention a script named configure, which is usually generated by the autoconf tool. Running this script yields the compilation parameters.
The configure script accounts for the differences between systems as far as possible and provides default values for the compilation parameters. If the user's system environment is unusual, or there are specific requirements, compilation parameters must be passed to the configure script by hand.
$ ./configure --prefix=/www --with-mysql
The command above configures a build of the PHP source code: the user specifies that the installed files go in the /www directory and that the build includes support for the mysql module.
The Second Step: Determine the Location of Standard Libraries and Header Files
The source code will almost certainly use standard library functions and header files. They can be stored in any directory on the system, and the compiler cannot detect their location on its own; it learns where they are only from the configuration file.
The second step of compilation is to read the locations of the standard libraries and header files from the configuration file. Typically the configuration file gives a list of specific directories, and at compile time the compiler searches those directories in order for the file it needs.
The Third Step: Determine the Dependency Relationship
For large projects, there is often a dependency relationship between source code files, and the compiler needs to determine the order of compilation. Assuming that file A depends on file B, the compiler should ensure the following two points.
(1) File A is compiled only after file B has been compiled.
(2) Whenever file B changes, file A is recompiled.
The compilation order is stored in a file called makefile, which lists which files are compiled first and which later. The makefile is generated by running the configure script, which is why configure must be run first.
While determining the dependency relationship, the compiler also determines which header files will be used during compilation.
The Fourth Step: Precompilation of Header Files (precompilation)
Different source files may include the same header file (such as stdio.h), and the header file must be compiled along with them. To save time, the compiler compiles the header file before compiling the source code, so that each header is compiled only once rather than every time it is used.
However, not everything in a header file is precompiled. The #define directives used to define macros are not precompiled.
The Fifth Step: Preprocessing
After precompilation is complete, the compiler replaces the header files and macros in the source code with their contents. Take the source code at the beginning of this article as an example: it includes the header file stdio.h, and after replacement it looks like this.
extern int fputs(const char *, FILE *);
extern FILE *stdout;
int main(void)
{
    fputs("Hello, world!\n", stdout);
    return 0;
}
To facilitate reading, the above code only takes the part of the header file related to the