X-Git-Url: http://wien.tomnetworks.com/gitweb/?a=blobdiff_plain;f=docs%2Faot-compiler.txt;h=22a868099cbf003664849f79367db2a81fbac3cf;hb=6470311b57eabe10537b670d9bd49e3f2935e050;hp=ab1af90d96584d1eafae6b96adb883570a185724;hpb=0abc2e6270020edc4a5b4c66f93b4ae582815f20;p=mono.git
diff --git a/docs/aot-compiler.txt b/docs/aot-compiler.txt
index ab1af90d965..22a868099cb 100644
--- a/docs/aot-compiler.txt
+++ b/docs/aot-compiler.txt
@@ -1,44 +1,173 @@
 Mono Ahead Of Time Compiler
 ===========================
-The new mono JIT has sophisticated optimization features. It uses SSA and has a
-pluggable architecture for further optimizations. This makes it possible and
-efficient to use the JIT also for AOT compilation.
+    The Ahead of Time compilation feature in Mono allows Mono to
+    precompile assemblies to minimize JIT time, reduce memory
+    usage at runtime and increase the code sharing across multiple
+    running Mono applications.
+    To precompile an assembly use the following command:
+
+        mono --aot -O=all assembly.exe
-* file format: We use the native object format of the platform. That way it is
-  possible to reuse existing tools like objdump and the dynamic loader. All we
-  need is a working assembler, i.e. we write out a text file which is then
-  passed to gas (the gnu assembler) to generate the object file.
+    The `--aot' flag instructs Mono to ahead-of-time compile your
+    assembly, while the -O=all flag instructs Mono to use all the
+    available optimizations.
-* file names: we simply add ".so" to the generated file. For example:
-  basic.exe -> basic.exe.so
-  corlib.dll -> corlib.dll.so
+* Position Independent Code
+---------------------------
-* staring the AOT compiler: mini --aot assembly_name
+    On x86 and x86-64 the code generated by Ahead-of-Time compiled
+    images is position-independent code.
+    This allows the same
+    precompiled image to be reused across multiple applications
+    without having different copies. This is the same way in which
+    ELF shared libraries work: the code produced can be relocated
+    to any address.
-The following things are saved in the object file:
+    The implementation of Position Independent Code had a
+    performance impact on Ahead-of-Time compiled images, but
+    compiler bootstraps are still faster with them than with
+    JIT-compiled code, especially with all the new optimizations
+    provided by the Mono engine.
-* version infos:
+* How to support Position Independent Code in new Mono Ports
+------------------------------------------------------------
-* native code: this is labeled with method_XXXXXXXX: where XXXXXXXX is the
-  hexadecimal token number of the method.
+    Generated native code needs to reference various runtime
+    structures/functions whose address is only known at run
+    time. JITted code can simply embed the address into the native
+    code, but AOT code needs to do an indirection. This
+    indirection is done through a table called the Global Offset
+    Table (GOT), which is similar to the GOT in the ELF
+    spec. When the runtime saves the AOT image, it saves some
+    information for each method describing the GOT entries
+    used by that method. When loading a method from an AOT image,
+    the runtime will fill out the GOT entries needed by the
+    method.
-* additional informations needed by the runtime: For example we need to store
-  the code length and the exception tables. We also need a way to patch
-  constants only available at runtime (for example vtable and class
-  addresses). This is stored i a binary blob labeled method_info_XXXXXXXX:
+    * Computing the address of the GOT
-PROBLEMS:
+    Methods which need to access the GOT first need to compute its
+    address. On the x86 it is done by code like this:
+
+        call <next instruction>
+        pop ebx
+        add <GOT offset>, ebx
+
+
+    The variable representing the GOT is stored in
+    cfg->got_var.
+    It is always allocated to a global register to
+    prevent some problems with branches and basic blocks.
+
+    * Referencing GOT entries
+
+    Any time the native code needs to access some other runtime
+    structure/function (i.e. any time the backend calls
+    mono_add_patch_info ()), the code pointed to by the patch needs
+    to load the value from the GOT. For example, instead of:
+
+        call <absolute address>
+    it needs to do:
+        call *<offset>(<GOT register>)
+
+    Here, the <offset> can be 0; it will be fixed up by the AOT compiler.
+
+    For more examples on the changes required, see
+
+        svn diff -r 37739:38213 mini-x86.c
+
+* The Precompiled File Format
+-----------------------------
+
+    We use the native object format of the platform. That way it
+    is possible to reuse existing tools like objdump and the
+    dynamic loader. All we need is a working assembler, i.e. we
+    write out a text file which is then passed to gas (the GNU
+    assembler) to generate the object file.
+
+    The precompiled image is stored in a file next to the original
+    assembly, named after the assembly with the native extension for
+    a shared library appended (on Linux this is ".so").
+
+    For example: basic.exe -> basic.exe.so; corlib.dll -> corlib.dll.so
+
+    The following things are saved in the object file and can be
+    looked up using the equivalent to dlsym:
+
+    mono_assembly_guid
+
+        A copy of the assembly GUID.
+
+    mono_aot_version
+
+        The version of the AOT file format.
+
+    mono_aot_opt_flags
+
+        The optimization flags used to build this
+        precompiled image.
+
+    method_infos
+
+        Contains additional information needed by the runtime for using the
+        precompiled method, like the GOT entries it uses.
+
+    method_info_offsets
+
+        Maps method indexes to offsets in the method_infos array.
+
+    mono_icall_table
+
+        A table that lists all the internal calls
+        referenced by the precompiled image.
+
+    mono_image_table
+
+        A list of assemblies referenced by this AOT
+        module.
+
+    method_offsets
+
+        The equivalent to a procedure linkage table.
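As an illustration of the dlsym-style lookup mentioned above, here is a minimal C sketch built on the POSIX dlopen/dlsym calls. The helper name lookup_image_symbol is hypothetical and not part of Mono, and the runtime's own loader may work differently. It only shows how a symbol such as mono_aot_version could be located in a file like basic.exe.so.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a Mono API): open a shared object and return
 * the address of one exported symbol. Returns NULL if the file cannot be
 * loaded or the symbol is absent. On success *handle_out keeps the image
 * mapped and must later be released with dlclose().
 */
static void *lookup_image_symbol(const char *image_path,
                                 const char *symbol,
                                 void **handle_out)
{
    *handle_out = NULL;

    void *handle = dlopen(image_path, RTLD_LAZY);
    if (handle == NULL)
        return NULL;              /* image missing or not loadable */

    void *addr = dlsym(handle, symbol);
    if (addr == NULL) {
        dlclose(handle);          /* symbol not exported by this image */
        return NULL;
    }

    *handle_out = handle;
    return addr;
}
```

A call such as lookup_image_symbol("./basic.exe.so", "mono_aot_version", &handle) would return the address of that symbol if the image exports it. The explicit "./" matters because dlopen searches the library path when the name contains no slash. How the data behind the address is laid out is not specified here.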
+
+* Performance considerations
+----------------------------
+
+Using AOT code is a trade-off which might lead to higher or lower performance,
+depending on a lot of circumstances. Some of these are:
+
+- AOT code needs to be loaded from disk before being used, so cold startup of
+  an application using AOT code MIGHT be slower than using JITted code. Warm
+  startup (when the code is already in the machine's cache) should be faster.
+  Also, JITting code takes time, and the JIT compiler also needs to load
+  additional metadata for the method from the disk, so startup can be faster
+  even in the cold startup case.
+- AOT code is usually compiled with all optimizations turned on, while JITted
+  code is usually compiled with default optimizations, so the generated code
+  in the AOT case should be faster.
+- JITted code can directly access runtime data structures and helper functions,
+  while AOT code needs to go through an indirection (the GOT) to access them,
+  so it will be slower and somewhat bigger as well.
+- When JITting code, the JIT compiler needs to load a lot of metadata about
+  methods and types into memory.
+- JITted code has better locality, meaning that if method A calls method B,
+  then the native code for A and B is usually quite close in memory, leading to
+  better cache behaviour and thus improved performance. In contrast, the native
+  code of methods inside the AOT file is in a somewhat random order.
+
+* Future Work
+-------------
+
+- Currently, the runtime needs to set up some data structures and fill out
+  GOT entries before a method is first called. This means that even calls to
+  a method whose code is in the same AOT image need to go through the GOT,
+  instead of using a direct call.
+- On x86, the generated code uses call 0, pop REG, add GOTOFFSET, REG to
+  materialize the GOT address. Newer versions of gcc use a separate function
+  to do this; maybe we need to do the same.
+- Currently, we get vtable addresses from the GOT.
+  Another solution would be
+  to store the data from the vtables in the .bss section, so accessing them
+  would involve less indirection.
-- all precompiled methods must be domain independent, or we add patch infos to
-  patch the target doamin.
-- the main problem is how to patch runtime related addresses, for example:
-  - current application domain
-  - string objects loaded with LDSTR
-  - address of MonoClass data
-  - static field offsets
-  - method addreses
-  - virtual function and interface slots
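The GOT indirection that this patch documents can be sketched in plain C. This is a toy model under stated assumptions, not Mono's implementation: the names got, method_info, resolve_slot, load_method and the two helpers are all hypothetical, and the slots hold only one function-pointer type to keep the example portable, whereas a real GOT holds arbitrary addresses. It shows the three steps from the text: each method records which GOT slots it uses, loading the method fills just those slots, and the method then calls through the table instead of through an embedded absolute address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical runtime helpers whose addresses are only known at run time. */
typedef int (*helper_fn)(int);
static int helper_add_one(int x) { return x + 1; }
static int helper_double(int x)  { return x * 2; }

/* Toy GOT: static storage, so every slot starts out NULL (unfilled). */
enum { GOT_SLOTS = 4 };
static helper_fn got[GOT_SLOTS];

/* Per-method record saved with the image: the GOT slots the method uses. */
typedef struct {
    const int *slots;
    size_t     count;
} method_info;

/* Stand-in for the runtime resolving a slot number to a real address. */
static helper_fn resolve_slot(int slot)
{
    switch (slot) {
    case 0:  return helper_add_one;
    case 1:  return helper_double;
    default: return NULL;
    }
}

/* Loading a method fills only the GOT entries that this method needs. */
static void load_method(const method_info *info)
{
    for (size_t i = 0; i < info->count; i++) {
        int slot = info->slots[i];
        assert(slot >= 0 && slot < GOT_SLOTS);
        if (got[slot] == NULL)
            got[slot] = resolve_slot(slot);
    }
}

/* "AOT-compiled" method: every call goes through the GOT. */
static int aot_method(int x)
{
    return got[1](got[0](x));   /* helper_double(helper_add_one(x)) */
}

/* The record describing aot_method: it uses slots 0 and 1. */
static const int aot_method_slots[] = { 0, 1 };
static const method_info aot_method_info = { aot_method_slots, 2 };
```

Calling load_method(&aot_method_info) before aot_method fills slots 0 and 1 and leaves the others empty. The extra load through the table on each call is the indirection cost that the performance section attributes to AOT code.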