* MCS: The Ximian C# compiler

MCS began as an experiment to learn the features of C# by
writing a large C# program. MCS is currently able to parse C#
programs and create an internal tree representation of the
program. MCS can parse itself.

MCS now does type checking at the class, interface and struct
levels, can resolve the class hierarchy and, as of last week,
can generate interface code.

Work is progressing quickly on various fronts in the C#
compiler. Recently I started using the System.Reflection API
to load system type definitions, so that the compiler does not
have to populate those types itself, and I dropped my internal
Type representation in favor of the CLI's System.Type.

** Phases of the compiler

The compiler has a number of phases:

    * Lexical analyzer: hand-coded lexical analyzer that
      provides tokens to the parser.

    * The Parser: the parser is implemented using Jay (a
      port of Berkeley Yacc to Java, which I ported to C#).
      The parser does minimal work and syntax checking,
      and only constructs a parse tree.

      Each language element gets its own class. The code
      convention is to use a capitalized name for the
      language element. So a C# class and its associated
      information is kept in a "Class" class, a "struct"
      in a "Struct" class and so on. Statements derive
      from the "Statement" class, and Expressions from the

    * Parent class resolution: before the actual code
      generation, we need to resolve the parents and
      interfaces for interface, class and struct

    * Semantic analysis: since C# cannot resolve in a
      top-down pass what identifiers actually mean, we
      have to postpone this decision until the above steps

    * Code generation: the compiler recently started generating IL
      executables that contain interfaces. Work is
      progressing in other areas.

      Code generation is done through the System.Reflection.Emit API.
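
As a rough illustration of the first phase in the list above, here is a minimal hand-coded lexical analyzer. It is a Python sketch, not the MCS source (which is written in C#); the `tokenize` function, the token kinds and the small `KEYWORDS` set are assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical keyword subset; a real C# lexer knows the full keyword list.
KEYWORDS = {"class", "struct", "interface", "enum"}

Token = Tuple[str, str]  # (kind, text)


def tokenize(source: str) -> List[Token]:
    """Split source text into (kind, text) tokens with a hand-written scanner."""
    tokens: List[Token] = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            # Whitespace only separates tokens.
            i += 1
        elif ch.isalpha() or ch == "_":
            # Identifier or keyword: letters, digits and underscores.
            start = i
            while i < len(source) and (source[i].isalnum() or source[i] == "_"):
                i += 1
            text = source[start:i]
            tokens.append(("KEYWORD" if text in KEYWORDS else "IDENT", text))
        elif ch.isdigit():
            # Integer literal: a run of digits.
            start = i
            while i < len(source) and source[i].isdigit():
                i += 1
            tokens.append(("NUMBER", source[start:i]))
        else:
            # Everything else is treated as single-character punctuation.
            tokens.append(("PUNCT", ch))
            i += 1
    return tokens
```

A parser would then pull tokens from this list one at a time, for example `tokenize("enum X { a = b + 1, b = 5 }")`.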

** Current pending tasks

    * Array declarations are currently being ignored.

    * PInvoke declarations are not supported.

    * Pre-processing is not supported.

    * Attribute declarations and passing are currently ignored.

    * The compiler does not pass line/column information from the
      tokenizer along for error reporting.

    * Jay does not work correctly with `error'
      productions, making parser errors hard to pinpoint. It
      would be best to port the Bison-to-Java compiler into
      a Bison-to-C# compiler (bjepson@oreilly.com
      might have more information).

Interesting and fun hacks to the compiler:

    * Finishing the JB port from Java to C#. If you are
      interested in working on this, please contact Brian
      Jepson (bjepson at oreilly d-o-t com).

      More on JB at: <a href="http://www.cs.colorado.edu/~dennis/software/jb.html">
      http://www.cs.colorado.edu/~dennis/software/jb.html</a>

      JB will allow us to move from the Berkeley Yacc
      based Jay to a Bison-based compiler (better error
      reporting and recovery).

    * Semantic analysis: return path coverage and
      initialization-before-use coverage are two great
      features of C# that help reduce the number of bugs
      in applications. Implementing them is an interesting hack.

    * Enum resolution: another fun hack, as enum members can be defined
      in terms of other members of the same enum
      (<tt>enum X { a = b + 1, b = 5 }</tt>).
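
One way to handle the self-referential enum in the last item is to resolve members lazily and detect cycles. The sketch below is in Python and is not how MCS implements it; the `resolve_enum` function and the simplified initializer model (an integer literal, or "another member plus an offset") are assumptions made for the example. Members without an initializer follow the C# rule of taking the previous member's value plus one, starting at zero.

```python
from typing import Dict, List, Tuple, Union

# Simplified initializer model (an assumption for this sketch):
#   None            -> implicit value (previous member + 1, or 0 if first)
#   int             -> literal value
#   (member, delta) -> value of another member plus a constant
Init = Union[None, int, Tuple[str, int]]


def resolve_enum(members: List[Tuple[str, Init]]) -> Dict[str, int]:
    """Compute the value of every enum member, whatever the declaration order."""
    inits = dict(members)
    order = [name for name, _ in members]
    values: Dict[str, int] = {}
    in_progress = set()

    def value_of(name: str) -> int:
        if name in values:
            return values[name]  # already resolved
        if name in in_progress:
            raise ValueError(f"circular enum definition involving '{name}'")
        in_progress.add(name)
        init = inits[name]
        if init is None:
            idx = order.index(name)
            result = 0 if idx == 0 else value_of(order[idx - 1]) + 1
        elif isinstance(init, int):
            result = init
        else:
            ref, delta = init
            result = value_of(ref) + delta  # may resolve a later member first
        in_progress.discard(name)
        values[name] = result
        return result

    for name in order:
        value_of(name)
    return values
```

For the example above, `resolve_enum([("a", ("b", 1)), ("b", 5)])` resolves `b` first even though it is declared second, giving `a = 6` and `b = 5`.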

** Questions and Answers

Q: Why not write a C# front-end for GCC?

A: I wanted to learn about C#, and this was an exercise in doing
   so. The resulting compiler is highly object-oriented, which has
   led to a very nice, easy to follow and simple implementation of

   I found that the design of this compiler is very similar to
   Guavac's implementation.

   Targeting the CIL/MSIL byte codes would require re-architecting
   GCC, as GCC is mostly designed to be used for register machines.
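
To make the contrast with register machines concrete: CIL is a stack-based instruction set, so code for an expression is a sequence of pushes and stack operations with no register assignment. The Python sketch below emits such a sequence for a small expression tree and evaluates it. The `emit` and `run` helpers and the tuple-based tree are assumptions made for the example; `ldc.i4`, `add`, `sub` and `mul` are real CIL opcode names, but the sketch only mimics their stack behavior and is not a CIL implementation.

```python
from typing import List, Optional, Tuple, Union

# Expression tree: an int literal, or (operator, left, right).
Expr = Union[int, Tuple[str, "Expr", "Expr"]]

OPCODES = {"+": "add", "-": "sub", "*": "mul"}


def emit(expr: Expr, code: Optional[List[str]] = None) -> List[str]:
    """Emit stack-machine instructions for expr in post-order."""
    if code is None:
        code = []
    if isinstance(expr, int):
        code.append(f"ldc.i4 {expr}")  # push a constant
    else:
        op, left, right = expr
        emit(left, code)   # operands are pushed first...
        emit(right, code)
        code.append(OPCODES[op])  # ...then the operator pops both
    return code


def run(code: List[str]) -> int:
    """Evaluate the emitted instructions on an explicit operand stack."""
    stack: List[int] = []
    for instr in code:
        if instr.startswith("ldc.i4 "):
            stack.append(int(instr.split()[1]))
        else:
            right = stack.pop()
            left = stack.pop()
            if instr == "add":
                stack.append(left + right)
            elif instr == "sub":
                stack.append(left - right)
            elif instr == "mul":
                stack.append(left * right)
            else:
                raise ValueError(f"unknown instruction: {instr}")
    return stack.pop()
```

For instance, `emit(("+", 1, ("*", 2, 3)))` yields the instruction list for `1 + 2 * 3`, and `run` evaluates that list using only the operand stack.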

   The GCC Java engine that generates Java byte codes cheats: it does
   not use the GCC backend; it has a special backend just for Java, so
   you cannot really generate Java bytecodes from the other languages

Q: If your C# compiler is written in C#, how do you plan on getting
   this working in a non-Microsoft environment?

A: We will do this through an implementation of the CLI Virtual
   Execution System for Unix (our JIT engine).

A: No, currently I am using Jay, which is a port of Berkeley Yacc to
   Java that I later ported to C#. This means that error recovery is
   not as nice as I would like, and for some reason error
   productions are not being caught.

   In the future I want to port one of the Bison/Java ports to C# for

Q: Should someone work on a GCC front-end to C#?

A: I would love it if someone did, and we would love to help anyone who
   takes on that task, but we do not have the time or expertise to
   build a C# compiler with the GCC engine. I find it a lot more fun
   personally to work in C# on a C# compiler, which has an intrinsic

   We can provide help and assistance to anyone who would like to work

Q: Should someone make a GCC backend that will generate CIL images?

A: I would love to see a backend to GCC that generates CIL images. It
   would provide a ton of free compilers that would generate CIL
   code. This is something that people would want to look into
   anyway for Windows interoperation in the future.

   Again, we would love to provide help and assistance to anyone
   interested in working on such a project.

Q: What about making a front-end to GCC that takes CIL images and
   generates native code?

A: I would love to see this, especially since GCC supports this same
   feature for Java byte codes. You could use the metadata library
   from Mono to read the byte codes (i.e., this would be your
   "front-end") and generate the trees that get passed to the

   Ideally our implementation of the CLI will be available as a shared
   library that could be linked with your application as its runtime

   Again, we would love to provide help and assistance to anyone
   interested in working on such a project.

Q: But would this work around the GPL in the GCC compiler and allow
   people to work on non-free front-ends?

A: People can already do this by targeting the JVM byte codes (there
   are about 130 compilers for various languages that target the JVM).

Q: Why are you writing a JIT engine instead of a front-end to GCC?

A: The JIT engine and runtime engine will be able to execute CIL
   executables generated on Windows.

You might also want to look at the <a href="faq.html#gcc">GCC</a>
section of the main FAQ.