This week I gained a deeper and clearer understanding of some of the terminology, and of the flow within a compiler, that I posted about last week.
Frontend
Basically, it uses a parser to build an abstract syntax tree (AST) from a given source file, and then translates the source code into an intermediate representation (IR). GCC uses several IRs: GENERIC, GIMPLE, and RTL.
GENERIC
GENERIC is a common representation that can represent programs written in all the languages supported by GCC. It is a language-independent tree structure generated by the frontend and handed to the “middle end” while compiling source code into executable binaries. It is produced by eliminating language-specific constructs from the parse tree generated from the code. Its purpose is simply to provide a language-independent way of representing an entire function in trees.
GIMPLE
A simplified subset of GENERIC for use in optimization, converted from GENERIC by the “gimplifier” and based on the tree data structure. It is produced by breaking complex expressions in the code down into a three-address representation, where each statement has at most three operands. There are three forms of GIMPLE:
- High level GIMPLE: what the middle-end produces when it lowers the GENERIC language that is targeted by all the language front ends.
- Low level GIMPLE: obtained by linearizing all the high-level control flow structures of high level GIMPLE, including nested functions, exception handling, and loops.
- SSA GIMPLE: low level GIMPLE rewritten in SSA form.
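To make the three-address form and the linearized control flow more concrete, here is a small hand-written sketch in C. It is only an illustration of the idea, not actual GCC output: the function names and the temporaries `t1`, `t2`, `t3` are my own hypothetical choices, and the real gimplifier names its temporaries and labels differently.

```c
#include <assert.h>

/* High-level source form: a nested expression inside a structured loop. */
int poly_sum(int n, int a, int b) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += (a + i) * (b - i);
    }
    return total;
}

/* Hand-lowered form (an assumption about the general shape, not real GIMPLE):
 * each statement has at most one operator, and the loop is expressed
 * with labels and gotos instead of a structured `for`. */
int poly_sum_lowered(int n, int a, int b) {
    int total = 0;
    int i = 0;
    int t1, t2, t3;          /* hypothetical temporaries */
    goto cond;
body:
    t1 = a + i;
    t2 = b - i;
    t3 = t1 * t2;
    total = total + t3;
    i = i + 1;
cond:
    if (i < n) goto body;
    return total;
}

/* Quick check that both forms compute the same value. */
void check_lowering(void) {
    assert(poly_sum(4, 1, 10) == poly_sum_lowered(4, 1, 10));
    assert(poly_sum(0, 1, 10) == poly_sum_lowered(0, 1, 10));
}
```

Both functions compute the same result; the second one just spells out the steps that the nested expression and the `for` loop hide.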
RTL (Register Transfer Language)
A very low level intermediate representation used in the backends of GCC that is very close to assembly language.
Backend
Works with the IR to produce code in the target output language, such as assembly.
The following is the flow from source code, through the compiler, to the generated assembly file.
- C/C++
- Frontend
- GENERIC
- GIMPLE
- RTL
- Assembly File
Generation of dump files
I have found that we can dump the output of each pass by passing an argument to GCC. This is the general form of the argument:
-fdump-<ir>-<passname>
where <ir> can be tree for intraprocedural passes on GIMPLE, ipa for interprocedural passes on GIMPLE, or rtl for intraprocedural passes on RTL. <passname> can be all to see all dumps, or, for tree dumps, ssa for static single assignment or gimple.
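As a small example of trying this out, here is a sketch of a C file with a possible command line in its header comment. The file name `dump_demo.c` and the function `square_sum` are hypothetical, and the dump file names mentioned in the comment are only the general pattern I would expect; the exact numbers vary between GCC versions.

```c
/* dump_demo.c (hypothetical file name)
 *
 * One possible way to compile it and request dumps:
 *
 *   gcc -c -fdump-tree-gimple -fdump-tree-ssa -fdump-rtl-all dump_demo.c
 *
 * GCC should then write dump files next to the output, with names along
 * the lines of dump_demo.c.<N>t.gimple, dump_demo.c.<N>t.ssa and
 * dump_demo.c.<N>r.<passname>, where <N> is a pass number.
 */

/* A nested expression, so the gimple dump has something to break
 * into three-address statements. */
int square_sum(int a, int b) {
    return a * a + b * b;
}
```

Opening the gimple dump next to the source is a good way to see how the single `return` expression gets split into several simpler statements.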