Well, tcc is one of the fastest C compilers when we speak about *compilation speed*. It achieves this by generating code almost simultaneously with syntax analysis, without the intermediate steps that other compilers use. Compilers like gcc do the following:
- Perform syntax analysis and generate an abstract syntax tree
- Perform semantic analysis over the tree
- Optimize the tree
- Generate intermediate code
- Optimize intermediate code
- Generate assembly code
- Optimize assembly code
- Generate machine code
- Link into an executable using an external linker
tcc does not build an intermediate tree or platform-neutral code. Its workflow is closer to the following:
- While analysing the syntax, check the semantics and build the required tables. If the syntax is correct, generate machine code immediately
- Link with the internal linker against a simple startup routine
The result is much faster compilation and often a smaller executable, thanks to the simple startup routine.
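To make the "generate code while parsing" idea concrete, here is a minimal sketch of a one-pass translator for arithmetic expressions. It is not tcc's code: the function names (`compile`, `emit`, `parse_expr` and so on) and the textual stack-machine "instructions" are assumptions made for illustration. The point is only that each instruction is emitted the moment the parser recognises a construct, and no tree is ever built.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical one-pass translator, for illustration only.
   Output is text for an imaginary stack machine, not real machine code. */
static const char *src;   /* current position in the source text */
static char out[256];     /* instructions emitted so far */

/* Append one instruction to the output buffer. */
static void emit(const char *text)
{
    strncat(out, text, sizeof out - strlen(out) - 1);
}

static void parse_expr(void);

/* primary := number | '(' expr ')' */
static void parse_primary(void)
{
    if (*src == '(') {
        src++;                      /* skip '(' */
        parse_expr();
        if (*src == ')')
            src++;                  /* skip ')' */
    } else {
        char buf[32];
        int value = 0;
        while (isdigit((unsigned char)*src))
            value = value * 10 + (*src++ - '0');
        snprintf(buf, sizeof buf, "push %d;", value);
        emit(buf);                  /* emitted as soon as the number is read */
    }
}

/* term := primary { '*' primary } */
static void parse_term(void)
{
    parse_primary();
    while (*src == '*') {
        src++;
        parse_primary();
        emit("mul;");
    }
}

/* expr := term { '+' term } */
static void parse_expr(void)
{
    parse_term();
    while (*src == '+') {
        src++;
        parse_term();
        emit("add;");
    }
}

/* Translate an expression such as "1+2*3" in a single pass. */
const char *compile(const char *text)
{
    src = text;
    out[0] = '\0';
    parse_expr();
    return out;
}
```

Calling `compile("1+2*3")` yields `push 1;push 2;push 3;mul;add;`. Because every instruction is already written out by the time the parser moves on, there is no later point at which the translator could look back and simplify what it produced.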
However, this also means that tcc cannot be the fastest C compiler when we speak about *execution speed*. For example, with this design it cannot detect dead stores, that is, assignments whose value is never used. Look at this code:
a=200;
b=10;
a=b;
Note that the first statement is completely useless, because a is overwritten before it is ever read. Compilers that generate intermediate formats can detect this and remove the statement from the generated code.
Similarly, some variables do not need to be stored in memory at all, like the loop counter "i" here:
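As a rough sketch of how such a compiler finds the useless statement, the pass below walks a list of assignments backwards and marks every store whose value is overwritten before it is read. The `struct assign` representation and the name `mark_dead_stores` are hypothetical, and the pass assumes straight-line code with single-letter variables that are all observable at the end. Note that it needs the complete statement list in memory, which is exactly what a compiler that has already emitted machine code for the first statement does not have.

```c
#include <stdbool.h>
#include <stddef.h>

enum { NVARS = 26 };   /* variables 'a'..'z' (assumption for this sketch) */

/* One statement of the form  dst = src;  or  dst = value; */
struct assign {
    char dst;      /* variable being written */
    char src;      /* variable being read, or 0 when a constant is assigned */
    int  value;    /* the constant, used only when src == 0 */
    bool dead;     /* set by mark_dead_stores() */
};

/* Backward pass: a store is dead if its target is not read
   before the next store to the same variable.
   Returns the number of statements marked dead. */
size_t mark_dead_stores(struct assign *code, size_t n)
{
    bool live[NVARS];
    size_t removed = 0;

    /* Assume every variable may be used after the last statement. */
    for (int v = 0; v < NVARS; v++)
        live[v] = true;

    for (size_t i = n; i-- > 0; ) {
        struct assign *s = &code[i];
        s->dead = !live[s->dst - 'a'];
        if (s->dead) {
            removed++;
            continue;
        }
        live[s->dst - 'a'] = false;      /* earlier values of dst are overwritten */
        if (s->src)
            live[s->src - 'a'] = true;   /* the value of src is needed here */
    }
    return removed;
}
```

For the three statements above (a=200; b=10; a=b;) the pass marks only the first one as dead and returns 1.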
for (int i=0;i<n;i++) pow=pow*b;
Once we notice that the loop counter is used only to repeat the multiplication n times, it is possible to keep the counter in a register, for example with the x86 LOOP instruction, and avoid the memory accesses. Unfortunately, the compiler cannot notice this without analysing the whole loop before generating code for it.
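The rewrite an optimizer performs here can be shown at the source level. The two functions below are an illustrative sketch (the names `power_up` and `power_down` are made up, and `p` stands in for the `pow` variable above): the second one counts down to zero, which is the shape that maps directly onto a register counter with a decrement and a conditional jump. Proving that the two are equivalent requires knowing that the counter is not used inside the loop body.

```c
/* Original shape: the counter runs upwards and is compared with n each time. */
unsigned long power_up(unsigned long b, int n)
{
    unsigned long p = 1;
    for (int i = 0; i < n; i++)
        p = p * b;
    return p;
}

/* Transformed shape: the counter only counts the remaining iterations,
   so it can live in a register and be decremented until it reaches zero. */
unsigned long power_down(unsigned long b, int n)
{
    unsigned long p = 1;
    for (int count = n; count > 0; count--)
        p = p * b;
    return p;
}
```

Both functions return the same value for every n, including n <= 0, where the loop body never runs.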
Should we make tcc work more like gcc and generate faster code? No! Adding intermediate formats and endless optimizations would destroy two good tcc properties: fast compilation and small compiler size. However, I would like to see some simple code generation improvements that are possible without changing the compiler design. For example, operations between a variable and a constant, like a=5; a>5; a+5 etc., can be done on the 80386 using one instruction, while tcc generates two.