X-Git-Url: http://wien.tomnetworks.com/gitweb/?a=blobdiff_plain;f=mcs%2Fdocs%2Fcompiler;h=bbdb95031902fe5f78ab8ac37249a0bf7c3bbe89;hb=25824cf4bfb818c38f8b799f2167a1bed824f6b9;hp=91ac498010776462aae14dbb0482b210da2a5c2f;hpb=e006f81d4261b311c111355f88ec470640056b70;p=mono.git

diff --git a/mcs/docs/compiler b/mcs/docs/compiler
index 91ac4980107..bbdb9503190 100755
--- a/mcs/docs/compiler
+++ b/mcs/docs/compiler
@@ -51,6 +51,153 @@
 	point are done at this point, and the output is saved to
 	disk.
 
+	The following list will give you an idea of where the
+	different pieces of the compiler live:
+
+	Infrastructure:
+
+	    driver.cs:
+		This drives the compilation process: loading of
+		command line options; parsing the input files;
+		loading the referenced assemblies; resolving the type
+		hierarchy and emitting the code.
+
+	    codegen.cs:
+
+		The state tracking for code generation.
+
+	    attribute.cs:
+
+		Code to do semantic analysis and emit the attributes
+		is here.
+
+	    rootcontext.cs:
+
+		Keeps track of the types defined in the source code,
+		as well as the assemblies loaded.
+
+	    typemanager.cs:
+
+		This contains the MCS type system.
+
+	    report.cs:
+
+		Error and warning reporting methods.
+
+	    support.cs:
+
+		Assorted utility functions used by the compiler.
+
+	Parsing
+
+	    cs-tokenizer.cs:
+
+		The tokenizer for the C# language; it also includes
+		the C# pre-processor.
+
+	    cs-parser.jay, cs-parser.cs:
+
+		The parser is implemented using a C# port of the Yacc
+		parser. The parser lives in the cs-parser.jay file,
+		and cs-parser.cs is the generated parser.
+
+	    location.cs:
+
+		The `location' structure is a compact representation
+		of a file, line, column where a token, or a high-level
+		construct appears. This is used to report errors.
+
+	Expressions:
+
+	    ecore.cs
+
+		Basic expression classes and interfaces; most shared
+		code and static methods are here.
+
+	    expression.cs:
+
+		Most of the different kinds of expression classes
+		live in this file.
+
+	    assign.cs:
+
+		The assignment expression got its own file.
+
+	    constant.cs:
+
+		The classes that represent the constant expressions.
+
+	    literal.cs
+
+		Literals are constants that have been entered manually
+		in the source code, like `1' or `true'. The compiler
+		needs to tell constants from literals apart during the
+		compilation process, as literals sometimes have some
+		implicit extra conversions defined for them.
+
+	    cfold.cs:
+
+		The constant folder for binary expressions.
+
+	Statements
+
+	    statement.cs:
+
+		All of the abstract syntax tree elements for
+		statements live in this file. This also drives the
+		semantic analysis process.
+
+	    iterators.cs:
+
+		Contains the support for implementing iterators from
+		the C# 2.0 specification.
+
+	Declarations, Classes, Structs, Enumerations
+
+	    decl.cs
+
+		This contains the base class for Members and
+		Declaration Spaces. A declaration space introduces
+		new names in types, so classes, structs, delegates and
+		enumerations derive from it.
+
+	    class.cs:
+
+		Methods for holding and defining class and struct
+		information, and every member that can be in these
+		(methods, fields, delegates, events, etc.).
+
+		The most interesting type here is the `TypeContainer'
+		which is a derivative of the `DeclSpace'.
+
+	    delegate.cs:
+
+		Handles delegate definition and use.
+
+	    enum.cs:
+
+		Handles enumerations.
+
+	    interface.cs:
+
+		Holds and defines interfaces. All the code related to
+		interface declaration lives here.
+
+	    parameter.cs:
+
+		During the parsing process, the compiler encapsulates
+		parameters in the Parameter and Parameters classes.
+		These classes provide definition and resolution tools
+		for them.
+
+	    pending.cs:
+
+		Routines to track pending implementations of abstract
+		methods and interfaces. These are used by the
+		TypeContainer-derived classes to track whether every
+		method required is implemented.
+
+
 * The parsing process
 
 	All the input files that make up a program need to be read in
@@ -72,7 +219,7 @@
 	At the time the assignment expression `a = "hello"' is parsed,
 	it is not know whether a is a class field from this class, or
 	its parents, or whether it is a property access or a variable
-	reference. The actual meaning of `a' will not be discvored
+	reference. The actual meaning of `a' will not be discovered
 	until the semantic analysis phase.
 
 ** The Tokenizer and the pre-processor
@@ -122,7 +269,17 @@
 	struct) that map each input source line to a linear number. As
 	new files are parsed, the Location manager is informed of the
 	new file, to allow it to map back from an int constant to
-	a file + line number.
+	a file + line number.
+
+	Prior to parsing/tokenizing any source files, the compiler
+	generates a list of all the source files and then reserves the
+	low N bits of the location to hold the source file, where N is
+	large enough to hold at least twice as many source files as were
+	specified on the command line (to allow for a #line in each file).
+	The upper 32-N bits are the line number in that file.
+
+	The token 0 is reserved for ``anonymous'' locations, i.e. if we
+	don't know the location (Location.Null).
 
 	The tokenizer also tracks the column number for a token, but
 	this is currently not being used or encoded. It could
@@ -157,6 +314,39 @@
 
 ** Expressions
 
+	Expressions in the Mono C# compiler are represented by the
+	`Expression' class. This is an abstract class that particular
+	kinds of expressions have to inherit from and override a few
+	methods.
+
+	The base Expression class contains two fields: `eclass' which
+	represents the "expression classification" (from the C#
+	specs) and the type of the expression.
+
+	Expressions have to be resolved before they can be used.
+	The resolution process is implemented by overriding the
+	`DoResolve' method. The DoResolve method has to set the
+	`eclass' field and the `type', perform all error checking and
+	computations that will be required for code generation at this
+	stage.
+
+	The return value from DoResolve is an expression. Most of the
+	time an Expression derived class will return itself (return
+	this) when it will handle the emission of the code itself, or
+	it can return a new Expression.
+
+	For example, the parser will create an "ElementAccess" class
+	for:
+
+		a [0] = 1;
+
+	During the resolution process, the compiler will know whether
+	this is an array access or an indexer access, and will
+	return either an ArrayAccess expression or an IndexerAccess
+	expression from DoResolve.
+
+
+
 *** The Expression Class
 
 	The utility functions that can be called by all children of
@@ -164,7 +354,7 @@
 
 ** Constants
 
-	Constants in the Mono C# compiler are reprensented by the
+	Constants in the Mono C# compiler are represented by the
 	abstract class `Constant'. Constant is in turn derived from
 	Expression. The base constructor for `Constant' just sets the
 	expression class to be an `ExprClass.Value', Constants are
@@ -294,6 +484,16 @@
 		The value that is allowed to be returned or NULL if
 		there is no return type.
 
+	* ReturnLabel
+
+		A `Label' used by the code if it must jump to it.
+		This is used by a few routines that deal with exception
+		handling.
+
+	* HasReturnLabel
+
+		Whether we have a return label defined by the toplevel
+		driver.
 
 	* ContainerType
 
@@ -332,8 +532,44 @@
 	* InUnsafe
 
 		Whether we are inside an unsafe block
+
+	Methods exposed by the EmitContext:
+
+	* EmitTopBlock()
+
+		This emits a toplevel block.
+
+		This routine is very simple, to allow the anonymous
+		method support to roll its two-stage version of this
+		routine on its own.
+
+	* NeedReturnLabel ():
+
+		This is used to flag during the resolution phase that
+		the driver needs to initialize the `ReturnLabel'.
+
+* Anonymous Methods
+
+	The introduction of anonymous methods in the compiler changed
+	various ways of doing things in the compiler. The most
+	significant one is the hard split between the resolution phase
+	and the emission phases of the compiler.
+
+	For instance, routines that referenced local variables no
+	longer can safely create temporary variables during the
+	resolution phase: they must do so from the emission phase,
+	since the variable might have been "captured", hence access to
+	it cannot be done with the local-variable operations from the runtime.
+
+	The code emission is in:
+
+		EmitTopBlock ()
+
+	which drives the process: it first resolves the topblock, then
+	emits the required metadata (local variable definitions) and
+	finally emits the code.
 
-* Miscelaneous
+* Miscellaneous
 
 ** Error Processing.
 
@@ -347,7 +583,7 @@
 	The error codes in the Mono C# compiler are the same as those
 	found in the Microsoft C# compiler, with a few exceptions
 	(where we report a few more errors, those are documented in
-	mcs/errors/errors.txt). The goal is to reduce confussion to
+	mcs/errors/errors.txt). The goal is to reduce confusion to
 	the users, and also to help us track the progress of the
 	compiler in terms of the errors we report.
 
@@ -372,3 +608,17 @@
 	RootContext.WarningLevel in a few places to decide whether a
 	warning is worth reporting to the user or not.
 
+* Debugging the compiler
+
+	Sometimes it is convenient to find *how* a particular error
+	message is being reported; to do that, you might want to use
+	the --fatal flag to mcs. The flag will instruct the compiler to
+	abort execution with a stack trace when the error is reported.
+
+	You can use this with -warnaserror to obtain the same effect
+	with warnings.
+
+* Editing the compiler sources
+
+	The compiler sources are intended to be edited at a width of 134 columns.
+	
\ No newline at end of file
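
The location encoding that the patch describes (low N bits for the source file, upper 32-N bits for the line, token 0 for Location.Null) can be illustrated with a small sketch. This is not the code in location.cs: the names `file_bits`, `pack`, `unpack` and `NULL_TOKEN` are hypothetical, and the sketch assumes that line numbers start at 1, so that no real location ever packs to 0.

```python
# Hypothetical sketch of the location encoding; none of these names come from mcs.

NULL_TOKEN = 0  # token 0 stands for an unknown location (Location.Null in the text)


def file_bits(source_count):
    """Return the smallest N such that 2**N >= 2 * source_count."""
    needed = 2 * source_count  # twice the files: room for a #line in each one
    bits = 0
    while (1 << bits) < needed:
        bits += 1
    return bits


def pack(file_index, line, bits):
    """Pack a file index (low N bits) and a line number (upper 32-N bits)."""
    if not 0 <= file_index < (1 << bits):
        raise ValueError("file index does not fit in the low bits")
    # Assumption: lines are numbered from 1, which keeps every token non-zero.
    if not 1 <= line < (1 << (32 - bits)):
        raise ValueError("line number does not fit in the upper bits")
    return (line << bits) | file_index


def unpack(token, bits):
    """Return (file_index, line), or None for the anonymous location."""
    if token == NULL_TOKEN:
        return None
    return token & ((1 << bits) - 1), token >> bits
```

With three source files on the command line, `file_bits(3)` reserves 3 bits, and `pack(2, 10, 3)` round-trips through `unpack` to `(2, 10)`. The trade-off this makes visible is that every bit reserved for file indexes halves the largest line number that can be represented.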