point are done at this point, and the output is saved to
disk.
+ The following list will give you an idea of where the
+ different pieces of the compiler live:
+
+ Infrastructure:
+
+ driver.cs:
+ This drives the compilation process: loading the
+ command line options; parsing the input files;
+ loading the referenced assemblies; resolving the type
+ hierarchy and emitting the code.
+
+ codegen.cs:
+
+ The state tracking for code generation.
+
+ attribute.cs:
+
+ Code to do semantic analysis and emit the attributes
+ is here.
+
+ rootcontext.cs:
+
+ Keeps track of the types defined in the source code,
+ as well as the assemblies loaded.
+
+ typemanager.cs:
+
+ This contains the MCS type system.
+
+ report.cs:
+
+ Error and warning reporting methods.
+
+ support.cs:
+
+ Assorted utility functions used by the compiler.
+
+ Parsing:
+
+ cs-tokenizer.cs:
+
+ The tokenizer for the C# language; it also includes
+ the C# pre-processor.
+
+ cs-parser.jay, cs-parser.cs:
+
+ The parser is implemented using a C# port of the Yacc
+ parser generator. The grammar lives in the cs-parser.jay
+ file, and cs-parser.cs is the generated parser.
+
+ location.cs:
+
+ The `location' structure is a compact representation
+ of the file, line and column where a token or a high-level
+ construct appears. This is used to report errors.
+
+ Expressions:
+
+ ecore.cs:
+
+ Basic expression classes and interfaces; most shared
+ code and static methods are here.
+
+ expression.cs:
+
+ Most of the different kinds of expression classes
+ live in this file.
+
+ assign.cs:
+
+ The assignment expression got its own file.
+
+ constant.cs:
+
+ The classes that represent the constant expressions.
+
+ literal.cs:
+
+ Literals are constants that have been entered manually
+ in the source code, like `1' or `true'. The compiler
+ needs to tell constants and literals apart during the
+ compilation process, as literals sometimes have some
+ implicit extra conversions defined for them.
+
+ cfold.cs:
+
+ The constant folder for binary expressions.
+
+ Statements:
+
+ statement.cs:
+
+ All of the abstract syntax tree elements for
+ statements live in this file. This also drives the
+ semantic analysis process.
+
+ iterators.cs:
+
+ Contains the support for implementing iterators from
+ the C# 2.0 specification.
+
+ Declarations, Classes, Structs, Enumerations:
+
+ decl.cs:
+
+ This contains the base class for Members and
+ Declaration Spaces. A declaration space introduces
+ new names in types, so classes, structs, delegates and
+ enumerations derive from it.
+
+ class.cs:
+
+ Methods for holding and defining class and struct
+ information, and every member that can be in these
+ (methods, fields, delegates, events, etc).
+
+ The most interesting type here is the `TypeContainer',
+ which derives from `DeclSpace'.
+
+ delegate.cs:
+
+ Handles delegate definition and use.
+
+ enum.cs:
+
+ Handles enumerations.
+
+ interface.cs:
+
+ Holds and defines interfaces. All the code related to
+ interface declaration lives here.
+
+ parameter.cs:
+
+ During the parsing process, the compiler encapsulates
+ parameters in the Parameter and Parameters classes.
+ These classes provide definition and resolution tools
+ for them.
+
+ pending.cs:
+
+ Routines to track pending implementations of abstract
+ methods and interfaces. These are used by the
+ TypeContainer-derived classes to track whether every
+ method required is implemented.
+
+
* The parsing process
All the input files that make up a program need to be read in
At the time the assignment expression `a = "hello"' is parsed,
it is not know whether a is a class field from this class, or
its parents, or whether it is a property access or a variable
- reference. The actual meaning of `a' will not be discvored
+ reference. The actual meaning of `a' will not be discovered
until the semantic analysis phase.
** The Tokenizer and the pre-processor
struct) that map each input source line to a linear number.
As new files are parsed, the Location manager is informed of
the new file, to allow it to map back from an int constant to
- a file + line number.
+ a file + line number.
+
+ Prior to parsing/tokenizing any source files, the compiler
+ generates a list of all the source files and then reserves the
+ low N bits of the location to hold the source file, where N is
+ large enough to hold at least twice as many source files as were
+ specified on the command line (to allow for a #line in each file).
+ The upper 32-N bits are the line number in that file.
+
+ The token 0 is reserved for ``anonymous'' locations, i.e. if we
+ don't know the location (Location.Null).
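+
+ As a rough illustration of the encoding described above, the
+ following sketch packs a file index and a line number into one
+ 32-bit token. The `LocationSketch' class and its method names
+ are hypothetical, not the compiler's Location API, and the
+ sketch assumes line numbers start at 1 so that no real
+ location encodes to the reserved token 0.
+
```csharp
using System;

// Hypothetical helper for illustration; not the compiler's Location struct.
static class LocationSketch
{
	// Smallest N such that 2^N slots cover twice the number of source
	// files (one extra slot per file for a possible #line directive).
	public static int FileBits (int sourceFileCount)
	{
		int slots = Math.Max (2 * sourceFileCount, 1);
		int bits = 0;
		while ((1 << bits) < slots)
			bits++;
		return bits;
	}

	// Low `bits' bits hold the file index, the remaining upper bits hold
	// the line. Assumes line >= 1 and that both values fit in their fields.
	public static int Encode (int fileIndex, int line, int bits)
	{
		return (line << bits) | fileIndex;
	}

	// Recover the file index from the low bits.
	public static int FileOf (int token, int bits)
	{
		return token & ((1 << bits) - 1);
	}

	// Recover the line number from the upper bits (unsigned shift).
	public static int LineOf (int token, int bits)
	{
		return (int) ((uint) token >> bits);
	}
}
```
+
+ Since every real line number is at least 1 under this
+ assumption, an encoded token is never 0, which leaves 0 free
+ to stand for the unknown location.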
The tokenizer also tracks the column number for a token, but
this is currently not being used or encoded. It could
** Expressions
+ Expressions in the Mono C# compiler are represented by the
+ `Expression' class. This is an abstract class: each particular
+ kind of expression has to inherit from it and override a few
+ methods.
+
+ The base Expression class contains two fields: `eclass' which
+ represents the "expression classification" (from the C#
+ specs) and the type of the expression.
+
+ Expressions have to be resolved before they can be used.
+ The resolution process is implemented by overriding the
+ `DoResolve' method. The DoResolve method has to set the
+ `eclass' field and the `type', perform all error checking and
+ computations that will be required for code generation at this
+ stage.
+
+ The return value from DoResolve is an expression. Most of the
+ time an Expression derived class will return itself (return
+ this) when it will handle the emission of the code itself, or
+ it can return a new Expression.
+
+ For example, the parser will create an "ElementAccess" class
+ for:
+
+ a [0] = 1;
+
+ During the resolution process, the compiler will know whether
+ this is an array access or an indexer access, and will
+ return either an ArrayAccess expression or an IndexerAccess
+ expression from DoResolve.
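+
+ The idea of DoResolve returning a replacement expression can
+ be sketched as follows. This is a simplified model, not the
+ compiler's code: the classes reuse the names from the text,
+ but their constructors, the parameterless DoResolve signature,
+ the reduced `ExprClass' enum and the use of System.Type to
+ stand for the target's type are all assumptions made for the
+ example.
+
```csharp
using System;

// A subset of the expression classifications, for illustration only.
enum ExprClass { Invalid, Value, Variable, IndexerAccess }

abstract class Expression
{
	public ExprClass eclass = ExprClass.Invalid;
	public Type type;

	// Returns the expression that will emit the code:
	// either `this' or a more specific replacement.
	public abstract Expression DoResolve ();
}

// What the parser creates for `a [0]' before it knows what `a' is.
class ElementAccess : Expression
{
	readonly Type target_type;	// type of `a', assumed to be known already

	public ElementAccess (Type target_type)
	{
		this.target_type = target_type;
	}

	public override Expression DoResolve ()
	{
		// Only during resolution can the two cases be told apart.
		if (target_type.IsArray)
			return new ArrayAccess (target_type.GetElementType ()).DoResolve ();
		return new IndexerAccess ().DoResolve ();
	}
}

class ArrayAccess : Expression
{
	public ArrayAccess (Type element_type)
	{
		type = element_type;
	}

	public override Expression DoResolve ()
	{
		eclass = ExprClass.Variable;	// an array element is a variable
		return this;
	}
}

class IndexerAccess : Expression
{
	public override Expression DoResolve ()
	{
		eclass = ExprClass.IndexerAccess;
		type = typeof (object);		// placeholder: indexer lookup is elided
		return this;
	}
}
```
+
+ In this sketch, resolving an ElementAccess over `int []'
+ yields an ArrayAccess whose type is `int', while any
+ non-array target yields an IndexerAccess.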
+
+
+
*** The Expression Class
The utility functions that can be called by all children of
** Constants
- Constants in the Mono C# compiler are reprensented by the
+ Constants in the Mono C# compiler are represented by the
abstract class `Constant'. Constant is in turn derived from
Expression. The base constructor for `Constant' just sets the
expression class to be an `ExprClass.Value', Constants are
The value that is allowed to be returned or NULL if
there is no return type.
+ * ReturnLabel
+
+ A `Label' used by the code if it must jump to it.
+ This is used by a few routines that deal with exception
+ handling.
+
+ * HasReturnLabel
+
+ Whether we have a return label defined by the toplevel
+ driver.
* ContainerType
* InUnsafe
Whether we are inside an unsafe block
+
+ Methods exposed by the EmitContext:
+
+ * EmitTopBlock ()
+
+ This emits a toplevel block.
+
+ This routine is very simple, to allow the anonymous
+ method support to roll its two-stage version of this
+ routine on its own.
+
+ * NeedReturnLabel ()
+
+ This is used to flag during the resolution phase that
+ the driver needs to initialize the `ReturnLabel'.
+
+* Anonymous Methods
+
+ The introduction of anonymous methods changed several things
+ in the compiler. The most significant change is the hard
+ split between the resolution phase and the emission phase.
+
+ For instance, routines that referenced local variables can no
+ longer safely create temporary variables during the
+ resolution phase: they must do so from the emission phase,
+ since the variable might have been "captured", in which case
+ it cannot be accessed with the runtime's local-variable operations.
+
+ The code emission is in:
+
+ EmitTopBlock ()
+
+ This drives the process: it first resolves the topblock, then
+ emits the required metadata (local variable definitions) and
+ finally emits the code.
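+
+ The ordering of those three steps can be pictured with the
+ miniature below. It is only a sketch: the `ITopBlock'
+ interface, the `EmitSketch' and `SampleBlock' classes and the
+ string log are hypothetical stand-ins, and the real
+ EmitTopBlock works on the compiler's own block and
+ EmitContext types.
+
```csharp
using System;
using System.Collections.Generic;

// Hypothetical view of a toplevel block, split into the three steps.
interface ITopBlock
{
	bool Resolve ();			// semantic analysis only
	void EmitMeta (List<string> log);	// local variable definitions
	void Emit (List<string> log);		// code; temporaries are created here
}

static class EmitSketch
{
	// Resolve first, then emit metadata, then emit code.
	// Nothing is emitted if resolution fails.
	public static bool EmitTopBlock (ITopBlock block, List<string> log)
	{
		if (!block.Resolve ())
			return false;
		block.EmitMeta (log);
		block.Emit (log);
		return true;
	}
}

// A trivial block that records the order in which it is driven.
class SampleBlock : ITopBlock
{
	public bool Resolve ()
	{
		return true;
	}

	public void EmitMeta (List<string> log)
	{
		log.Add ("locals");
	}

	public void Emit (List<string> log)
	{
		log.Add ("code");
	}
}
```
+
+ Keeping Resolve free of side effects on the generated method
+ is what lets a captured variable be redirected before any
+ local-variable instruction has been emitted for it.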
-* Miscelaneous
+* Miscellaneous
** Error Processing.
The error codes in the Mono C# compiler are the same as those
found in the Microsoft C# compiler, with a few exceptions
(where we report a few more errors, those are documented in
- mcs/errors/errors.txt). The goal is to reduce confussion to
+ mcs/errors/errors.txt). The goal is to reduce confusion to
the users, and also to help us track the progress of the
compiler in terms of the errors we report.
RootContext.WarningLevel in a few places to decide whether a
warning is worth reporting to the user or not.
+* Debugging the compiler
+
+ Sometimes it is convenient to find *where* a particular error
+ message is being reported from. To do that, you might want to use
+ the --fatal flag to mcs. The flag instructs the compiler to
+ abort with a stack trace when the error is reported.
+
+ You can use this with -warnaserror to obtain the same effect
+ with warnings.
+
+* Editing the compiler sources
+
+ The compiler sources are intended to be edited with a width of 134 columns.
+
\ No newline at end of file