X-Git-Url: http://wien.tomnetworks.com/gitweb/?a=blobdiff_plain;f=docs%2Fmini-porting.txt;h=25be0a775665700e4144717a60be84635f938c60;hb=4347c57e9783cf9348af2e0ec773024ab6d9b9fc;hp=75af68c1f8314a16605c971d4404f7dedbd67f29;hpb=c2e395049f6788103a10131904568ee8ae43652a;p=mono.git diff --git a/docs/mini-porting.txt b/docs/mini-porting.txt index 75af68c1f83..25be0a77566 100644 --- a/docs/mini-porting.txt +++ b/docs/mini-porting.txt @@ -1,338 +1,424 @@ - Mono JIT porting guide. - Paolo Molaro (lupus@ximian.com) + Mono JIT porting guide. + Paolo Molaro (lupus@ximian.com) * Introduction -This documents describes the process of porting the mono JIT -to a new CPU architecture. The new mono JIT has been designed -to make porting easier though at the same time enable the port -to take full advantage from the new architecture features and -instructions. Knowledge of the mini architecture (described in the -mini-doc.txt file) is a requirement for understanding this guide, -as well as an earlier document about porting the mono interpreter -(available on the web site). - -There are six main areas that a port needs to implement to -have a fully-functional JIT for a given architecture: - - 1) instruction selection - 2) native code emission - 3) call conventions and register allocation - 4) method trampolines - 5) exception handling - 6) minor helper methods - -To take advantage of some not-so-common processor features (for example -conditional execution of instructions as may be found on ARM or ia64), it may -be needed to develop an high-level optimization, but doing so is not a -requirement for getting the JIT to work. - -We'll see in more details each of the steps required, note, though, -that a new port may just as well start from a cut&paste of an existing -port to a similar architecture (for example from x86 to amd64, or from -powerpc to sparc). 
-The architecture specific code is split from the rest of the JIT, -for example the x86 specific code and data is all included in the -following files in the distribution: - - mini-x86.h mini-x86.c - inssel-x86.brg - cpu-pentium.md - tramp-x86.c - exceptions-x86.c - -I suggest a similar split for other architectures as well. - -Note that this document is still incomplete: some sections are only -sketched and some are missing, but the important info to get a port -going is already described. + This document describes the process of porting the mono JIT + to a new CPU architecture. The new mono JIT has been designed + to make porting easier while still letting the port take full + advantage of the new architecture's features and + instructions. Knowledge of the mini architecture (described in + the mini-doc.txt file) is a requirement for understanding this + guide, as well as an earlier document about porting the mono + interpreter (available on the web site). + + There are six main areas that a port needs to implement to + have a fully-functional JIT for a given architecture: + + 1) instruction selection + 2) native code emission + 3) call conventions and register allocation + 4) method trampolines + 5) exception handling + 6) minor helper methods + + To take advantage of some not-so-common processor features + (for example conditional execution of instructions as may be + found on ARM or ia64), it may be necessary to develop a + high-level optimization, but doing so is not a requirement for + getting the JIT to work. + + We'll look at each of the required steps in more detail. Note, + though, that a new port may just as well start from a + cut&paste of an existing port to a similar architecture (for + example from x86 to amd64, or from powerpc to sparc).
+ + The architecture specific code is split from the rest of the + JIT, for example the x86 specific code and data is all + included in the following files in the distribution: + + mini-x86.h mini-x86.c + inssel-x86.brg + cpu-pentium.md + tramp-x86.c + exceptions-x86.c + + I suggest a similar split for other architectures as well. + + Note that this document is still incomplete: some sections are + only sketched and some are missing, but the important info to + get a port going is already described. * Architecture-specific instructions and instruction selection. -The JIT already provides a set of instructions that can be easily -mapped to a great variety of different processor instructions. -Sometimes it may be necessary or advisable to add a new instruction -that represent more closely an instruction in the architecture. -Note that a mini instruction can be used to represent also a short -sequence of CPU low-level instructions, but note that each -instruction represents the minimum amount of code the instruction -scheduler will handle (i.e., the scheduler won't schedule the instructions -that compose the low-level sequence as individual instructions, but just -the whole sequence, as an indivisible block). -New instructions are created by adding a line in the mini-ops.h file, -assigning an opcode and a name. To specify the input and output for -the instruction, there are two different places, depending on the context -in which the instruction gets used. -If the instruction is used in the tree representation, the input and output -types are defined by the BURG rules in the *.brg files (the usual -non-terminals are 'reg' to represent a normal register, 'lreg' to -represent a register or two that hold a 64 bit value, freg for a -floating point register). -If an instruction is used as a low-level CPU instruction, the info -is specified in a machine description file. 
The description file is -processed by the genmdesc program to provide a data structure that -can be easily used from C code to query the needed info about the -instruction. -As an example, let's consider the add instruction for both x86 and ppc: - -x86 version: - add: dest:i src1:i src2:i len:2 clob:1 -ppc version: - add: dest:i src1:i src2:i len:4 - -Note that the instruction takes two input integer registers on both CPU, -but on x86 the first source register is clobbered (clob:1) and the length -in bytes of the instruction differs. -Note that integer adds and floating point adds use different opcodes, unlike -the IL language (64 bit add is done with two instructions on 32 bit architectures, -using a add that sets the carry and an add with carry). -A specific CPU port may assign any meaning to the clob field for an instruction -since the value will be processed in an arch-specific file anyway. -See the top of the existing cpu-pentium.md file for more info on other fields: -the info may or may not be applicable to a different CPU, in this latter case -the info can be ignored. -The code in mini.c together with the BURG rules in inssel.brg, inssel-float.brg -and inssel-long32.brg provides general purpose mappings from the tree representation -to a set of instructions that should be easily implemented in any architecture. -To allow for additional arch-specific functionality, an arch-specific BURG file -can be used: in this file arch-specific instructions can be selected that provide -better performance than the general instructions or that provide functionality -that is needed by the JIT but that cannot be expressed in a general enough way. -As an example, x86 has the special instruction "push" to make it easier to -implement the default call convention (passing arguments on the stack): almost -all the other architectures don't have such an instruction (and don't need it anyway), -so we added a special rule in the inssel-x86.brg file for it. 
- -So, one of the first things needed in a port is to write a cpu-$(arch).md machine -description file and fill it with the needed info. As a start, only a few -instructions can be specified, like the ones required to do simple integer -operations. The default rules of the instruction selector will emit the common -instructions and so we're ready to go for the next step in porting the JIT. - + The JIT already provides a set of instructions that can be + easily mapped to a great variety of different processor + instructions. Sometimes it may be necessary or advisable to + add a new instruction that represent more closely an + instruction in the architecture. Note that a mini instruction + can be used to represent also a short sequence of CPU + low-level instructions, but note that each instruction + represents the minimum amount of code the instruction + scheduler will handle (i.e., the scheduler won't schedule the + instructions that compose the low-level sequence as individual + instructions, but just the whole sequence, as an indivisible + block). + + New instructions are created by adding a line in the + mini-ops.h file, assigning an opcode and a name. To specify + the input and output for the instruction, there are two + different places, depending on the context in which the + instruction gets used. + + If the instruction is used in the tree representation, the + input and output types are defined by the BURG rules in the + *.brg files (the usual non-terminals are 'reg' to represent a + normal register, 'lreg' to represent a register or two that + hold a 64 bit value, freg for a floating point register). + + If an instruction is used as a low-level CPU instruction, the + info is specified in a machine description file. The + description file is processed by the genmdesc program to + provide a data structure that can be easily used from C code + to query the needed info about the instruction. 
+ + As an example, let's consider the add instruction for both x86 + and ppc: + + x86 version: + add: dest:i src1:i src2:i len:2 clob:1 + ppc version: + add: dest:i src1:i src2:i len:4 + + Note that the instruction takes two input integer registers on + both CPU, but on x86 the first source register is clobbered + (clob:1) and the length in bytes of the instruction differs. + + Note that integer adds and floating point adds use different + opcodes, unlike the IL language (64 bit add is done with two + instructions on 32 bit architectures, using a add that sets + the carry and an add with carry). + + A specific CPU port may assign any meaning to the clob field + for an instruction since the value will be processed in an + arch-specific file anyway. + + See the top of the existing cpu-pentium.md file for more info + on other fields: the info may or may not be applicable to a + different CPU, in this latter case the info can be ignored. + + The code in mini.c together with the BURG rules in inssel.brg, + inssel-float.brg and inssel-long32.brg provides general + purpose mappings from the tree representation to a set of + instructions that should be easily implemented in any + architecture. To allow for additional arch-specific + functionality, an arch-specific BURG file can be used: in this + file arch-specific instructions can be selected that provide + better performance than the general instructions or that + provide functionality that is needed by the JIT but that + cannot be expressed in a general enough way. + + As an example, x86 has the special instruction "push" to make + it easier to implement the default call convention (passing + arguments on the stack): almost all the other architectures + don't have such an instruction (and don't need it anyway), so + we added a special rule in the inssel-x86.brg file for it. + + So, one of the first things needed in a port is to write a + cpu-$(arch).md machine description file and fill it with the + needed info. 
As a start, only a few instructions can be + specified, like the ones required to do simple integer + operations. The default rules of the instruction selector will + emit the common instructions and so we're ready to go for the + next step in porting the JIT. + *) Native code emission -Since the first step in porting mono to a new CPU is to port the interpreter, -there should be already a file that allows the emission of binary native code -in a buffer for the architecture. This file should be placed in the - mono/arch/$(arch)/ -directory. - -The bulk of the code emission happens in the mini-$(arch).c file, in a function -called mono_arch_output_basic_block (). This function takes a basic block, walks the -list of instructions in the block and emits the binary code for each. -Optionally a peephole optimization pass is done on the basic block, but this can be -left for later, when the port actually works. -This function is very simple, there is just a big switch on the instruction opcode -and in the corresponding case the functions or macros to emit the binary native code -are used. Note that in this function the lengths of the instructions are used to -determine if the buffer for the code needs enlarging. - -To complete the code emission for a method, a few other functions need -implementing as well: - - mono_arch_emit_prolog () - mono_arch_emit_epilog () - mono_arch_patch_code () - -mono_arch_emit_prolog () will emit the code to setup the stack frame for a method, -optionally call the callbacks used in profiling and tracing, and move the -arguments to their home location (in a caller-save register if the variable was -allocated to one, or in a stack location if the argument was passed in a volatile -register and wasn't allocated a non-volatile one). caller-save registers used by the -function are saved in the prolog as well. - -mono_arch_emit_epilog () will emit the code needed to return from the function, -optionally calling the profiling or tracing callbacks. 
At this point the basic blocks -or the code that was moved out of the normal flow for the function can be emitted -as well (this is usually done to provide better info for the static branch predictor). -In the epilog, caller-save registers are restored if they were used. -Note that, to help exception handling and stack unwinding, when there is a transition -from managed to unmanaged code, some special processing needs to be done (basically, -saving all the registers and setting up the links in the Last Managed Frame -structure). - -When the epilog has been emitted, the upper level code arranges for the buffer of -memory that contains the native code to be copied in an area of executable memory -and at this point, instructions that use relative addressing need to be patched -to have the right offsets: this work is done by mono_arch_patch_code (). + Since the first step in porting mono to a new CPU is to port + the interpreter, there should be already a file that allows + the emission of binary native code in a buffer for the + architecture. This file should be placed in the + + mono/arch/$(arch)/ + + directory. + + The bulk of the code emission happens in the mini-$(arch).c + file, in a function called mono_arch_output_basic_block + (). This function takes a basic block, walks the list of + instructions in the block and emits the binary code for each. + Optionally a peephole optimization pass is done on the basic + block, but this can be left for later, when the port actually + works. + + This function is very simple, there is just a big switch on + the instruction opcode and in the corresponding case the + functions or macros to emit the binary native code are + used. Note that in this function the lengths of the + instructions are used to determine if the buffer for the code + needs enlarging. 
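As a rough illustration of the emission loop described above, here is a minimal sketch in C. It is not the real mono_arch_output_basic_block (): the ToyIns and ToyBuf types, the TOY_OP_* opcodes and the toy_* helpers are all hypothetical names made up for this example. It only shows the two ideas from the text: a big switch on the opcode, and the use of a per-instruction maximum length (the role played by the len: field of the machine description) to decide when the code buffer needs enlarging.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical opcodes and instruction record; not the real MonoInst. */
enum { TOY_OP_NOP, TOY_OP_ADD, TOY_OP_RET };

typedef struct ToyIns {
    int opcode;
    uint8_t dreg, sreg;          /* destination and source register numbers */
    const struct ToyIns *next;   /* next instruction in the basic block */
} ToyIns;

typedef struct {
    uint8_t *code;
    size_t len;    /* bytes emitted so far */
    size_t size;   /* bytes allocated */
} ToyBuf;

/* Upper bound on the encoded length of an opcode (0 means unknown opcode).
 * This plays the role of the len: field in the machine description. */
static size_t toy_max_len(int opcode) {
    switch (opcode) {
    case TOY_OP_NOP: return 1;
    case TOY_OP_ADD: return 2;
    case TOY_OP_RET: return 1;
    default: return 0;
    }
}

/* Enlarge the buffer if fewer than `need` bytes remain. Returns 0 on failure. */
static int toy_reserve(ToyBuf *b, size_t need) {
    if (b->size - b->len >= need)
        return 1;
    size_t nsize = b->size ? b->size * 2 : 16;
    while (nsize - b->len < need)
        nsize *= 2;
    uint8_t *p = realloc(b->code, nsize);
    if (p == NULL)
        return 0;
    b->code = p;
    b->size = nsize;
    return 1;
}

/* Walk the instruction list and emit x86 bytes for each opcode. */
static int toy_output_basic_block(ToyBuf *b, const ToyIns *first) {
    for (const ToyIns *ins = first; ins != NULL; ins = ins->next) {
        size_t max = toy_max_len(ins->opcode);
        if (max == 0 || !toy_reserve(b, max))
            return 0;
        size_t start = b->len;
        switch (ins->opcode) {
        case TOY_OP_NOP:
            b->code[b->len++] = 0x90;
            break;
        case TOY_OP_ADD:
            /* x86 "add r/m32, r32": opcode 0x01, then a ModRM byte. */
            b->code[b->len++] = 0x01;
            b->code[b->len++] =
                (uint8_t)(0xC0 | ((ins->sreg & 7) << 3) | (ins->dreg & 7));
            break;
        case TOY_OP_RET:
            b->code[b->len++] = 0xC3;
            break;
        }
        /* The declared maximum length must never be exceeded. */
        assert(b->len - start <= max);
    }
    return 1;
}
```

Checking the maximum length before the switch means the emission macros themselves never need to test for buffer space, which is the reason the lengths in the machine description have to be upper bounds.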
+ + To complete the code emission for a method, a few other + functions need implementing as well: + + mono_arch_emit_prolog () + mono_arch_emit_epilog () + mono_arch_patch_code () + + mono_arch_emit_prolog () will emit the code to set up the stack + frame for a method, optionally call the callbacks used in + profiling and tracing, and move the arguments to their home + location (in a callee-saved register if the variable was + allocated to one, or in a stack location if the argument was + passed in a volatile register and wasn't allocated a + non-volatile one). Callee-saved registers used by the function + are saved in the prolog as well. + + mono_arch_emit_epilog () will emit the code needed to return + from the function, optionally calling the profiling or tracing + callbacks. At this point the basic blocks or the code that was + moved out of the normal flow for the function can be emitted + as well (this is usually done to provide better info for the + static branch predictor). In the epilog, callee-saved + registers are restored if they were used. + + Note that, to help exception handling and stack unwinding, + when there is a transition from managed to unmanaged code, + some special processing needs to be done (basically, saving + all the registers and setting up the links in the Last Managed + Frame structure). + + When the epilog has been emitted, the upper level code + arranges for the buffer of memory that contains the native + code to be copied into an area of executable memory and at this + point, instructions that use relative addressing need to be + patched to have the right offsets: this work is done by + mono_arch_patch_code (). * Call conventions and register allocation -To account for the differences in the call conventions, a few functions need to -be implemented. - -mono_arch_allocate_vars () assigns to both arguments and local variables -the offset relative to the frame register where they are stored, dead -variables are simply discarded.
The total amount of stack needed is calculated. - -mono_arch_call_opcode () is the function that more closely deals with the call -convention on a given system. For each argument to a function call, an instruction -is created that actually puts the argument where needed, be it the stack or a -specific register. This function can also re-arrange th order of evaluation -when multiple arguments are involved if needed (like, on x86 arguments are pushed -on the stack in reverse order). The function needs to carefully take into accounts -platform specific issues, like how structures are returned as well as the -differences in size and/or alignment of managed and corresponding unmanaged -structures. - -The other chunk of code that needs to deal with the call convention and other -specifics of a CPU, is the local register allocator, implemented in a function -named mono_arch_local_regalloc (). The local allocator deals with a basic block -at a time and basically just allocates registers for temporary -values during expression evaluation, spilling and unspilling as necessary. -The local allocator needs to take into account clobbering information, both -during simple instructions and during function calls and it needs to deal -with other architecture-specific weirdnesses, like instructions that take -inputs only in specific registers or output only is some. -Some effort will be put later in moving most of the local register allocator to -a common file so that the code can be shared more for similar, risc-like CPUs. -The register allocator does a first pass on the instructions in a block, collecting -liveness information and in a backward pass on the same list performs the -actual register allocation, inserting the instructions needed to spill values, -if necessary. - -When this part of code is implemented, some testing can be done with the generated -code for the new architecture. 
Most helpful is the use of the --regression -command line switch to run the regression tests (basic.cs, for example). -Note that the JIT will try to initialize the runtime, but it may not be able yet to -compile and execute complex code: commenting most of the code in the mini_init() -function in mini.c is needed to let the JIT just compile the regression tests. -Also, using multiple -v switches on the command line makes the JIT dump an -increasing amount of information during compilation. - - + To account for the differences in the call conventions, a few functions need to + be implemented. + + mono_arch_allocate_vars () assigns to both arguments and local + variables the offset relative to the frame register where they + are stored; dead variables are simply discarded. The total + amount of stack needed is calculated. + + mono_arch_call_opcode () is the function that more closely + deals with the call convention on a given system. For each + argument to a function call, an instruction is created that + actually puts the argument where needed, be it the stack or a + specific register. This function can also re-arrange the order + of evaluation when multiple arguments are involved if needed + (like, on x86 arguments are pushed on the stack in reverse + order). The function needs to carefully take into account + platform specific issues, like how structures are returned as + well as the differences in size and/or alignment of managed + and corresponding unmanaged structures. + + The other chunk of code that needs to deal with the call + convention and other specifics of a CPU is the local register + allocator, implemented in a function named + mono_arch_local_regalloc (). The local allocator deals with a + basic block at a time and basically just allocates registers + for temporary values during expression evaluation, spilling + and unspilling as necessary.
+ + The local allocator needs to take into account clobbering + information, both during simple instructions and during + function calls and it needs to deal with other + architecture-specific weirdnesses, like instructions that take + inputs only in specific registers or produce output only in some. + + Some effort will be put later in moving most of the local + register allocator to a common file so that the code can be + shared more for similar, risc-like CPUs. The register + allocator does a first pass on the instructions in a block, + collecting liveness information and in a backward pass on the + same list performs the actual register allocation, inserting + the instructions needed to spill values, if necessary. + + When this part of code is implemented, some testing can be + done with the generated code for the new architecture. Most + helpful is the use of the --regression command line switch to + run the regression tests (basic.cs, for example). + + Note that the JIT will try to initialize the runtime, but it + may not yet be able to compile and execute complex code: + commenting out most of the code in the mini_init() function in + mini.c is needed to let the JIT just compile the regression + tests. Also, using multiple -v switches on the command line + makes the JIT dump an increasing amount of information during + compilation. + + * Method trampolines -To get better startup performance, the JIT actually compiles a method only when -needed. To achieve this, when a call to a method is compiled, we actually emit a -call to a magic trampoline. The magic trampoline is a function written in assembly -that invokes the compiler to compile the given method and jumps to the newly compiled -code, ensuring the arguments it received are passed correctly to the actual method. -Before jumping to the new code, though, the magic trampoline takes care of patching -the call site so that next time the call will go directly to the method instead of the -trampoline.
How does this all work? -mono_arch_create_jit_trampoline () creates a small function that just -preserves the arguments passed to it and adds an additional argument (the method -to compile) before calling the generic trampoline. This small function is called -the specific trampoline, because it is method-specific (the method to compile -is hard-code in the instruction stream). -The generic trampoline saves all the arguments that could get clobbered -and calls a C function that will do two things: - -*) actually call the JIT to compile the method -*) identify the calling code so that it can be patched to call directly -the actual method - -If the 'this' argument to a method is a boxed valuetype that is passed to -a method that expects just a pointer to the data, an additional unboxing -trampoline will need to be inserted as well. - + To get better startup performance, the JIT actually compiles a + method only when needed. To achieve this, when a call to a + method is compiled, we actually emit a call to a magic + trampoline. The magic trampoline is a function written in + assembly that invokes the compiler to compile the given method + and jumps to the newly compiled code, ensuring the arguments + it received are passed correctly to the actual method. + + Before jumping to the new code, though, the magic trampoline + takes care of patching the call site so that next time the + call will go directly to the method instead of the + trampoline. How does this all work? + + mono_arch_create_jit_trampoline () creates a small function + that just preserves the arguments passed to it and adds an + additional argument (the method to compile) before calling the + generic trampoline. This small function is called the specific + trampoline, because it is method-specific (the method to + compile is hard-coded in the instruction stream).
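The compile-on-first-call and call-site patching idea described so far can be sketched in portable C, with a function pointer standing in for the patched call instruction. This is only a model under stated assumptions: real trampolines are written in assembly and patch machine code, and the ToyMethod and ToyCallSite types and the toy_* functions are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*ToyCode)(int);

/* Hypothetical method record: `compiled` stays NULL until first use. */
typedef struct ToyMethod {
    const char *name;
    ToyCode compiled;
    int compile_count;
} ToyMethod;

/* A call site is modelled as a slot holding the current call target.
 * Patching the call site means overwriting `target`. */
typedef struct ToyCallSite {
    int (*target)(struct ToyCallSite *site, int arg);
    ToyMethod *method;
} ToyCallSite;

/* Stands in for the native code the JIT would produce. */
static int toy_square(int x) { return x * x; }

/* Stands in for the call into the JIT compiler. */
static ToyCode toy_compile(ToyMethod *m) {
    m->compile_count++;
    m->compiled = toy_square;
    return m->compiled;
}

/* What the call site does once it has been patched. */
static int toy_direct_call(ToyCallSite *site, int arg) {
    return site->method->compiled(arg);
}

/* The trampoline: compile if needed, patch the caller, then run the
 * method with the argument it originally received. */
static int toy_trampoline(ToyCallSite *site, int arg) {
    assert(site != NULL && site->method != NULL);
    ToyCode code = site->method->compiled;
    if (code == NULL)
        code = toy_compile(site->method);
    site->target = toy_direct_call;
    return code(arg);
}
```

A call site starts out with target set to toy_trampoline. The first call through it compiles the method and rewrites the slot, so later calls skip the trampoline entirely.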
+ + The generic trampoline saves all the arguments that could get + clobbered and calls a C function that will do two things: + + *) actually call the JIT to compile the method + *) identify the calling code so that it can be patched to call directly + the actual method + + If the 'this' argument to a method is a boxed valuetype that + is passed to a method that expects just a pointer to the data, + an additional unboxing trampoline will need to be inserted as + well. + * Exception handling -Exception handling is likely the most difficult part of the port, as it needs -to deal with unwinding (both managed and unmanaged code) and calling -catch and filter blocks. It also needs to deal with signals, because mono -takes advantage of the MMU in the CPU and of the operation system to -handle dereferences of the NULL pointer. Some of the function needed -to implement the mechanisms are: - -mono_arch_get_throw_exception () returns a function that takes an exception object -and invokes an arch-specific function that will enter the exception processing. -To do so, all the relevant registers need to be saved and passed on. - -mono_arch_handle_exception () this function takes the exception thrown and -a context that describes the state of the CPU at the time the exception was -thrown. The function needs to implement the exception handling mechanism, -so it makes a search for an handler for the exception and if none is found, -it follows the unhandled exception path (that can print a trace and exit or -just abort the current thread). The difficulty here is to unwind the stack -correctly, by restoring the register state at each call site in the call chain, -calling finally, filters and handler blocks while doing so. - -As part of exception handling a couple of internal calls need to be implemented -as well. -ves_icall_get_frame_info () returns info about a specific frame. -mono_jit_walk_stack () walks the stack and calls a callback with info for -each frame found. 
-ves_icall_get_trace () return an array of StackFrame objects. - + Exception handling is likely the most difficult part of the + port, as it needs to deal with unwinding (both managed and + unmanaged code) and calling catch and filter blocks. It also + needs to deal with signals, because mono takes advantage of + the MMU in the CPU and of the operating system to handle + dereferences of the NULL pointer. Some of the functions needed + to implement the mechanisms are: + + mono_arch_get_throw_exception () returns a function that takes + an exception object and invokes an arch-specific function that + will enter the exception processing. To do so, all the + relevant registers need to be saved and passed on. + + mono_arch_handle_exception () takes the + exception thrown and a context that describes the state of the + CPU at the time the exception was thrown. The function needs + to implement the exception handling mechanism, so it + searches for a handler for the exception and if none is found, + it follows the unhandled exception path (that can print a + trace and exit or just abort the current thread). The + difficulty here is to unwind the stack correctly, by restoring + the register state at each call site in the call chain, + calling finally, filter and handler blocks while doing so. + + As part of exception handling a couple of internal calls need + to be implemented as well. + + ves_icall_get_frame_info () returns info about a specific + frame. + + mono_jit_walk_stack () walks the stack and calls a callback with info for + each frame found. + + ves_icall_get_trace () returns an array of StackFrame objects. + ** Code generation for filter/finally handlers -Filter and finally handlers are called from 2 different locations: - - 1.) from within the method containing the exception clauses - 2.) from the stack unwinding code - -To make this possible we implement them like subroutines, ending with a -"return" statement.
The subroutine does not save the base pointer, because we -need access to the local variables of the enclosing method. Its is possible -that instructions inside those handlers modify the stack pointer, thus we save -the stack pointer at the start of the handler, and restore it at the end. We -have to use a "call" instruction to execute such finally handlers. Filters -receives the exception object inside a register (ECX on x86). - -The MIR code for filter and finally handlers looks like: - - OP_START_HANDLER - ... - OP_END_FINALLY | OP_ENDFILTER(reg) - -OP_START_HANDLER: should save the stack pointer somewhere -OP_END_FINALLY: restores the stack pointers and returns. -OP_ENDFILTER (reg): restores the stack pointers and returns the value in "reg". - + Filter and finally handlers are called from 2 different locations: + + 1.) from within the method containing the exception clauses + 2.) from the stack unwinding code + + To make this possible we implement them like subroutines, + ending with a "return" statement. The subroutine does not save + the base pointer, because we need access to the local + variables of the enclosing method. It is possible that + instructions inside those handlers modify the stack pointer, + thus we save the stack pointer at the start of the handler, + and restore it at the end. We have to use a "call" instruction + to execute such finally handlers. + + The MIR code for filter and finally handlers looks like: + + OP_START_HANDLER + ... + OP_END_FINALLY | OP_ENDFILTER(reg) + + OP_START_HANDLER: should save the stack pointer somewhere + OP_END_FINALLY: restores the stack pointer and returns. + OP_ENDFILTER (reg): restores the stack pointer and returns the value in "reg". + ** Calling finally/filter handlers -There is a special opcode to call those handler, its called OP_CALL_HANDLER. It -simple emits a call instruction.
- -Its a bit more complex to call handler from outside (in the stack unwinding -code), because we have to restore the whole context of the method first. After that -we simply emit a call instruction to invoke the handler. Its usually -possible to use the same code to call filter and finally handlers (see -arch_get_call_filter). - + There is a special opcode to call those handlers, called + OP_CALL_HANDLER. It simply emits a call instruction. + + It is a bit more complex to call a handler from outside (in the + stack unwinding code), because we have to restore the whole + context of the method first. After that we simply emit a call + instruction to invoke the handler. It is usually possible to use + the same code to call filter and finally handlers (see + arch_get_call_filter). + +** Calling catch handlers + + Catch handlers are always called from the stack unwinding + code. Unlike finally clauses or filters, catch handlers never + return. Instead we simply restore the whole context, and + restart execution at the catch handler. + +** Passing Exception objects to catch handlers and filters. + + We use a local variable to store exception objects. The stack + unwinding code must store the exception object into this + variable before calling a catch handler or filter. + * Minor helper methods -A few minor helper methods are referenced from the arch-independent code. -Some of them are: - -*) mono_arch_cpu_optimizations () - This function returns a mask of optimizations that should be enabled for the - current CPU and a mask of optimizations that should be excluded, instead. - -*) mono_arch_regname () - Returns the name for a numeric register. - -*) mono_arch_get_allocatable_int_vars () - Returns a list of variables that can be allocated to the integer registers - in the current architecture. - -*) mono_arch_get_global_int_regs () - Returns a list of caller-save registers that can be used to allocate variables - in the current method.
- -*) mono_arch_instrument_mem_needs () -*) mono_arch_instrument_prolog () -*) mono_arch_instrument_epilog () - Functions needed to implement the profiling interface. - - + A few minor helper methods are referenced from the arch-independent code. + Some of them are: + + *) mono_arch_cpu_optimizations () + This function returns a mask of optimizations that + should be enabled for the current CPU and a mask of + optimizations that should be excluded, instead. + + *) mono_arch_regname () + Returns the name for a numeric register. + + *) mono_arch_get_allocatable_int_vars () + Returns a list of variables that can be allocated to + the integer registers in the current architecture. + + *) mono_arch_get_global_int_regs () + Returns a list of callee-saved registers that can be + used to allocate variables in the current method. + + *) mono_arch_instrument_mem_needs () + *) mono_arch_instrument_prolog () + *) mono_arch_instrument_epilog () + Functions needed to implement the profiling interface. + + * Writing regression tests -Regression tests for the JIT should be written for any bug found in the JIT -in one of the *.cs files in the mini directory. Eventually all the operations -of the JIT should be tested (including the ones that get selected only when -some specific optimization is enabled). - + Regression tests for the JIT should be written for any bug + found in the JIT in one of the *.cs files in the mini + directory. Eventually all the operations of the JIT should be + tested (including the ones that get selected only when some + specific optimization is enabled). + * Platform specific optimizations -An example of a platform-specific optimization is the peephole optimization: -we look at a small window of code at a time and we replace one or more -instructions with others that perform better for the given architecture or CPU.
- + An example of a platform-specific optimization is the peephole + optimization: we look at a small window of code at a time and + we replace one or more instructions with others that perform + better for the given architecture or CPU. +
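To make the peephole idea concrete, here is a small sketch in C. It does not reproduce the peephole pass of any existing port: the PeepIns type, the PEEP_* opcodes and peep_pass () are hypothetical names, and the three rewrites shown (dropping a move of a register to itself, dropping a move that just undoes the previous move, and turning a multiply by a power of two into a shift) are only examples of the kind of replacement a port might choose.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for the sketch. */
enum { PEEP_MOVE, PEEP_MUL_IMM, PEEP_SHL_IMM };

typedef struct {
    int opcode;
    int dreg;   /* destination register */
    int sreg;   /* source register */
    int imm;    /* immediate operand, used by the *_IMM opcodes */
} PeepIns;

/* Returns log2(v) if v is a power of two greater than 1, else -1. */
static int peep_log2(int v) {
    if (v < 2 || (v & (v - 1)) != 0)
        return -1;
    int n = 0;
    while ((v >>= 1) != 0)
        n++;
    return n;
}

/* Rewrites ins[0..count) in place and returns the new instruction count. */
static size_t peep_pass(PeepIns *ins, size_t count) {
    assert(ins != NULL || count == 0);
    size_t out = 0;
    for (size_t i = 0; i < count; i++) {
        PeepIns cur = ins[i];

        /* "move r, r" does nothing: drop it. */
        if (cur.opcode == PEEP_MOVE && cur.dreg == cur.sreg)
            continue;

        /* Window of two: "move b, a" followed by "move a, b".
         * After the first move both registers hold the same value,
         * so the second one is redundant. */
        if (cur.opcode == PEEP_MOVE && out > 0) {
            const PeepIns *prev = &ins[out - 1];
            if (prev->opcode == PEEP_MOVE &&
                prev->dreg == cur.sreg && prev->sreg == cur.dreg)
                continue;
        }

        /* Multiply by a power of two becomes a left shift. */
        if (cur.opcode == PEEP_MUL_IMM) {
            int shift = peep_log2(cur.imm);
            if (shift > 0) {
                cur.opcode = PEEP_SHL_IMM;
                cur.imm = shift;
            }
        }

        ins[out++] = cur;
    }
    return out;
}
```

The pass only ever looks at the current instruction and the last one kept, which is what keeps a peephole optimization cheap enough to run on every basic block.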