* MCS: The Ximian C# compiler

MCS began as an experiment to learn the features of C# by writing a large C# program. MCS is currently able to parse C# programs and create an internal tree representation of the program. MCS can parse itself.

Work is progressing quickly on various fronts in the C# compiler. Recently I started using the System.Reflection API to load system type definitions, which avoids having the compiler populate those types itself, and I dropped my internal Type representation in favor of .NET's System.Type.

** Phases of the compiler

The compiler has a number of phases:

    * Lexical analyzer: a hand-coded lexical analyzer that provides tokens to the parser.

    * The Parser: the parser is implemented using Jay (a Berkeley Yacc port to Java, that I ported to C#). The parser does minimal work and syntax checking, and only constructs a parse tree. Each language element gets its own class. The code convention is to use a capitalized name for the language element. So a C# class and its associated information is kept in a "Class" class, a "struct" in a "Struct" class and so on. Statements derive from the "Statement" class, and expressions from the "Expr" class.

    * Parent class resolution: before the actual code generation, we need to resolve the parents and interfaces for interface, class and struct definitions.

    * Semantic analysis: since a single top-down pass cannot determine what identifiers actually mean in C#, we have to postpone this decision until the above steps are finished.

    * Code generation: nothing done so far, but I do not expect this to be hard, as I will just use System.Reflection.Emit to generate the code.

** Current pending tasks

    * Array declarations are currently being ignored.

    * PInvoke is not supported.

    * Pre-processing is not supported.

    * Attribute declarations and passing are currently ignored.

    * The compiler does not pass around line/col information from the tokenizer for error reporting.

    * Jay does not work correctly with `error' productions, making parser errors hard to pinpoint.
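As a rough illustration of the class-per-language-element convention and the parent class resolution phase described above, here is a minimal sketch. It is not taken from the MCS sources: the TypeContainer base class, the ParentName and Parent members, and the ParentResolver helper are all hypothetical names invented for this example, and it only looks parents up among the system types via Type.GetType, ignoring interfaces and user-defined parents.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical node classes that follow the naming convention from the text:
// one class per language element, named after the element with a capital.
// None of these definitions are copied from MCS.
public abstract class Statement { }
public abstract class Expr { }

public abstract class TypeContainer
{
    public string Name { get; }
    public string ParentName { get; }  // name as written in the source, or null
    public Type Parent { get; set; }   // filled in by parent resolution

    protected TypeContainer(string name, string parentName)
    {
        Name = name;
        ParentName = parentName;
    }
}

public sealed class Class : TypeContainer
{
    public Class(string name, string parentName) : base(name, parentName) { }
}

public sealed class Struct : TypeContainer
{
    public Struct(string name) : base(name, null) { }
}

public static class ParentResolver
{
    // Assumption: parents are system types that System.Reflection can find by
    // full name. Without an explicit parent, a class derives from
    // System.Object and a struct from System.ValueType.
    public static void Resolve(TypeContainer node)
    {
        if (node.ParentName != null)
        {
            Type found = Type.GetType(node.ParentName);
            if (found == null)
                throw new InvalidOperationException(
                    "Unknown parent type: " + node.ParentName);
            node.Parent = found;
        }
        else
        {
            node.Parent = node is Struct ? typeof(ValueType) : typeof(object);
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        var nodes = new List<TypeContainer>
        {
            new Class("MyError", "System.Exception"),
            new Class("Plain", null),
            new Struct("Point")
        };

        // Resolution runs as a separate pass after the tree has been built.
        foreach (TypeContainer node in nodes)
        {
            ParentResolver.Resolve(node);
            Console.WriteLine(node.Name + " : " + node.Parent.FullName);
        }
    }
}
```

Running Demo.Main prints each node next to its resolved parent, for instance `MyError : System.Exception`. Storing the parent as a System.Type mirrors the decision mentioned earlier to rely on .NET's own type representation rather than a private one.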
** Questions and Answers

Q: Why not write a C# front-end for GCC?

A: I wanted to learn about C#, and writing this compiler was an exercise toward that goal. The resulting compiler is highly object-oriented, which has led to a very nice, simple and easy-to-follow implementation. I found that the design of this compiler is very similar to Guavac's implementation.

Targeting the CIL/MSIL byte codes would require re-architecting GCC, as GCC is mostly designed to be used for register machines.

The GCC Java engine that generates Java byte codes cheats: it does not use the GCC backend; it has a special backend just for Java, so you cannot really generate Java bytecodes from the other languages supported by GCC.

Q: If your C# compiler is written in C#, how do you plan on getting this working on a non-Microsoft environment?

A: The compiler will have two output mechanisms: IL code or C code. A compiled version of the compiler could be run on Unix using the JIT runtime.

The C output generation bit is just intended to be a temporary measure to allow Unix hackers to contribute to the effort without requiring Windows and Microsoft's .NET implementation to work on the compiler. So the MCS C# compiler will compile itself to C, this code is then compiled to an executable on Unix, and voila! We have a native compiler for GNU/Linux.

Q: Do you use Bison?

A: No, currently I am using Jay, which is a port of Berkeley Yacc to Java that I later ported to C#. This means that error recovery is not as nice as I would like, and for some reason error productions are not being caught.

In the future I want to port one of the Bison/Java ports to C# for the parser.

Q: How do I compile it?

A: Compiling MCS currently requires you to run my port of Jay to C# on a Unix system to generate the parser, and then you need to use Microsoft's .NET csc.exe compiler to compile the compiler.
You only need to compile the compiler compiler itself (the C code); the samples are Java samples that I did not port, and you do not need them. It might be simple to port Jay.cs to Windows, but I have not tried this.