2004-02-27 Miguel de Icaza <miguel@ximian.com>

diff --git a/docs/mini-doc.txt b/docs/mini-doc.txt

index 4c6f57c42d0fc2756b51adf127b581e892b90d67..30d9b4442b7ad5cc8c24bb5c2f67e04f5ab88008 100644
--- a/docs/mini-doc.txt
+++ b/docs/mini-doc.txt
@@ -2,19 +2,28 @@
                A new JIT compiler for the Mono Project
  
            Miguel de Icaza (miguel@{ximian.com,gnome.org}),
+          Paolo Molaro (lupus@{ximian.com,debian.org})
  
    
  * Abstract
  
         Mini is a new compilation engine for the Mono runtime.  The
         new engine is designed to bring new code generation
-       optimizations, portability and precompilation. 
+       optimizations, portability and pre-compilation. 
  
         In this document we describe the design decisions and the
         architecture of the new compilation engine. 
  
  * Introduction
  
+       Mono is an Open Source implementation of the .NET Framework: it
+       is made up of a runtime engine that implements the ECMA Common
+       Language Infrastructure (CLI), a set of compilers that target
+       the CLI and a large collection of class libraries.
+
+       This article discusses the new code generation facilities that
+       have been added to the Mono runtime.  
+
         First we discuss the overall architecture of the Mono runtime,
         and how code generation fits into it; Then we discuss the
         development and basic architecture of our first JIT compiler
@@ -119,7 +128,7 @@
          inferred. 
  
         At this point the JIT would pass the constructed forest of
-       trees to the architecture-dependant JIT compiler.  
+       trees to the architecture-dependent JIT compiler.  
  
         The architecture dependent code then performed register
         allocation (optionally using linear scan allocation for
@@ -496,7 +505,7 @@
         The difference is on the set of optimizations that are turned
         on for each mode: Just-in-Time compilation should be as fast
         as possible, while Ahead-of-Time compilation can take as long
-       as required, because this is not done at a time criticial
+       as required, because this is not done at a time critical
         time. 
  
         With AOT compilation, we can afford to turn all of the
@@ -509,7 +518,7 @@
         assembler, which generates a loadable module.
  
         At execution time, when an assembly is loaded from the disk,
-       the runtime engine will probe for the existance of a
+       the runtime engine will probe for the existence of a
         pre-compiled image.  If the pre-compiled image exists, then it
         is loaded, and the method invocations are resolved to the code
         contained in the loaded module.
@@ -584,7 +593,7 @@
  
          2) liveness information for the variables
  
-        3) (optionally) loop info to favour variables that are used in
+        3) (optionally) loop info to favor variables that are used in
          inner loops.
  
         During instruction selection phase, symbolic registers are
@@ -597,19 +606,6 @@
         registers, fixed registers and clobbered registers by each
         operation.
  
-
-----------
-* Bootstrap 
-
-       The Mini bootstrap parses the arguments passed on the command
-       line, and initializes the JIT runtime. Each time the
-       mini_init() routine is invoked, a new Application Domain will
-       be returned.
-
-* Signal handlers
-
-       mono_runtime_install_handlers
-
  * BURG Code Generator Generator
  
         monoburg was written by Dietmar Maurer. It is based on the
@@ -624,6 +620,120 @@
         JIT. This simplifies the code because we can directly pass DAGs and
         don't need to convert them to trees.
  
+* Adding IL opcodes: an exercise (from a post by Paolo Molaro)
+
+       mini.c is the file that reads the IL code stream and decides
+       how each IL instruction is implemented (in the
+       mono_method_to_ir () function), so you always have to add an
+       entry to the big switch inside that function; there are
+       plenty of examples in that file.
+
+       An IL opcode can be implemented in a number of ways, depending
+       on what it does and how it needs to do it.
+       
+       Some opcodes are implemented using a helper function: one of
+       the simpler examples is the CEE_STELEM_REF implementation.
+
+       In this case the opcode implementation is written in a C
+       function.  You will need to register the function with the JIT
+       before you can use it (mono_register_jit_call) and you need to
+       emit the call to the helper using the mono_emit_jit_icall()
+       function.  
+
+       This is the simplest way to add a new opcode and it doesn't
+       require any arch-specific change (though it's limited to what
+       you can do in C code, and performance may suffer from the
+       cost of the function call).
+       
+       Other opcodes can be implemented with one or more of the already
+       implemented low-level instructions. 
+
+       An example is the OP_STRLEN opcode, which implements
+       String.Length using a simple load from memory.  In this case
+       you need to add a rule to the appropriate burg file,
+       describing the arguments of the opcode and its 'return'
+       value, if any.
+
+       The OP_STRLEN case is:
+       
+       reg: OP_STRLEN (reg) {  
+               MONO_EMIT_LOAD_MEMBASE_OP (s, tree, OP_LOADI4_MEMBASE, state->reg1, 
+                       state->left->reg1, G_STRUCT_OFFSET (MonoString, length));
+       }
+
+       The above means: OP_STRLEN takes a register as an argument
+       and returns its value in a register.  The implementation is
+       given inside the braces.
+       
+       The opcode returns a value in an integer register
+       (state->reg1) by performing an int32 load of the length field
+       of the MonoString represented by the input register
+       (state->left->reg1).  Before the burg rules are applied, the
+       internal representation is based on trees, so you get the
+       left/right pointers (state->left and state->right
+       respectively); the result is stored in state->reg1.
+
+       This instruction implementation doesn't require arch-specific
+       changes (it uses MONO_EMIT_LOAD_MEMBASE_OP, which is
+       available on all platforms), and usually the produced code is
+       fast.
+       
+       Next we have opcodes that must be implemented with new low-level
+       architecture-specific instructions (either because of performance
+       considerations or because the functionality can't be implemented in
+       other ways).  
+
+       You need a burg rule in this case, too. For example,
+       consider the OP_CHECK_THIS opcode (used to raise an exception
+       if the this pointer is null). The burg rule simply reads:
+       
+       stmt: OP_CHECK_THIS (reg) {
+               mono_bblock_add_inst (s->cbb, tree);
+       }
+       
+       Note that this opcode does not return a value (hence the
+       "stmt") and it takes a register as input.
+
+       mono_bblock_add_inst (s->cbb, tree) just adds the instruction
+       (the tree variable) to the current basic block (s->cbb). In
+       mini this is the place where the internal representation
+       switches from the tree format to the low-level format (the
+       list of simple instructions).
+
+       In this case the actual opcode implementation is delegated to
+       the arch-specific code.  A low-level opcode needs an entry in
+       the machine description (the *.md files in mini/). This entry
+       describes what kind of registers, if any, are used by the
+       instruction, as well as other architecture-specific details
+       such as constraints or other hints to the low-level
+       engine.
+
+       cpu-pentium.md, for example, has the following entry:
+       
+       checkthis: src1:b len:3
+       
+       This means the instruction uses an integer register as a base
+       pointer (basically a load or store is done on it) and it takes
+       3 bytes of native code to implement it.
+
+       Now you just need to provide the low-level implementation for
+       the opcode in one of the mini-$arch.c files, in the
+       mono_arch_output_basic_block() function. There is a big switch
+       here too. The x86 implementation is:
+
+               case OP_CHECK_THIS:
+                       /* ensure ins->sreg1 is not NULL */
+                       x86_alu_membase_imm (code, X86_CMP, ins->sreg1, 0, 0);
+                       break;
+       
+       If the $arch-codegen.h header file doesn't have the code to
+       emit the low-level native code, you'll need to write that as
+       well.  
+
+       Complex opcodes with register constraints may require other
+       changes to the local register allocator, but usually they are
+       not needed.
+               
  * Future
  
          Profile-based optimization is something that we are very
@@ -655,4 +765,4 @@
         processors, and some of the framework exists today in our
         register allocator and the instruction selector to cope with
         this, but has not been finished.  The instruction selection
-       would happen at the same time as local register allocation. 
-\ No newline at end of file
+       would happen at the same time as local register allocation. <
+\ No newline at end of file