X-Git-Url: http://wien.tomnetworks.com/gitweb/?a=blobdiff_plain;f=docs%2Fprecise-gc;h=cf0f733639d1646d2ceaad65e2d98107bcc1a817;hb=43e9ddc1fcddb632fc1e661495e468a2ebdf6fec;hp=8f84ea3203bb6167d52dd33c58314caa37def702;hpb=6b6435d1b3206b0162c37e5ecce8d9a699fe6467;p=mono.git

diff --git a/docs/precise-gc b/docs/precise-gc
index 8f84ea3203b..cf0f733639d 100644
--- a/docs/precise-gc
+++ b/docs/precise-gc
@@ -3,75 +3,95 @@
 precise, generational GC for better performance and smaller
 memory usage (no false-positives memory retentions with big
 allocations).
-This is a large task, but it can be done in steps.
-
-1) use the GCJ support to mark reference fields in objects, so
-scanning the heap is faster. This is mostly done already, needs
-checking that it is always used correctly (big objects, arrays).
-There are also some data structures we use in the runtime that are
-currently untyped that are allocated in the GC heap and used to
-keep references to GC objects. We need to make them typed so as to
-precisely track GC references or make them non-GC memory,
-by using more the GC handle support code (MonoGHashTable, MonoDomain,
-etc).
-
-2) don't include in the static roots the .bss and .data segments
-to save in scanning time and limit false-positives. This is mostly
-done already.
-
-3) keep track precisely of stack locations and registers in native
-code generation. This basically requires the regalloc rewrite code
-first, if we don't want to duplicate much of it. This is the hardest
-task of all, since proving its correctness is very hard. Some tricks,
-like having a build that injects GC.Collect() after every few simple
-operations may help. We also need to decide if we want to handle safe
-points at calls and back jumps only or at every instruction. The latter
-case is harder to implement and requires we keep around much more data
-(it potentially makes for faster stop-the-world phases).
-The first case requires us to be able to advance a thread until it
-reaches the next safe point: this can be done with the same techniques
-used by a debugger. We already need something like this to handle
-safely aborts happening in the middle of a prolog in managed code,
-for example, so this could be an additional sub-task that can be done
-separately from the GC work.
-Note that we can adapt the libgc code to use the info we collect
-when scanning the stack in managed methods and still use the conservative
-approach for the unmanaged stack, until we have our own collector,
-which requires we define a proper icall interface to switch from managed
-to unmanaged code (how do we handle object references in the icall
-implementations, for example).
-
-4) we could make use of the generational capabilities of the
-Boehm GC, but not with the current method involving signals which
-may create incompatibilities and is not supported on all platforms.
-We need to start using write barriers: they will be required anyway
-for the generational GC we'll use. When a field holding a reference
-is changed in an object (or an item in an array), we mark the card
-or page where the field is stored as dirty. Later, when a collection
-is run, only objects in pages marked as dirty are scanned for
-references instead of the whole heap. This could take a few days to
-implement and probably much more time to debug if all the cases were
-not caught:-)
-
-5) actually write the new generational and precise collector. There are
-several examples out there as open source projects, though the CLR
-needs some specific semantics so the code needs to be written from
-scratch anyway. Compared to item 3 this is relatively easier and it can
-be tested outside of mono, too, until mono is ready to use it.
-The important features needed:
-*) precise, so there is no false positive memory retention
-*) generational to reduce collection times
-*) pointer-hopping allocation to reduce alloc time
-*) possibly per-thread lock-free allocation
-*) handle weakrefs and finalizers with the CLR semantics
-
-Note: some GC engines use a single mmap area, because it makes
-handling generations and the implementation much easier, but this also
-limits the expansion of the heap, so people may need to use a command-line
-option to set the max heap size etc. It would be better to have a design
-that allows mmapping a few megabytes chunks at a time.
-
-The different tasks can be done in parallel. 1, 2 and 4 can be done in time
-for the mono 1.2 release. Parts of 3 and 5 could be done as well.
-The complete switch is supposed to happen with the mono 2.0 release.
+The first working implementation is committed in metadata/sgen-gc.c
+as of May, 2006. This is a two-generations moving collector and it is
+currently used to shake out all the issues in the runtime that need to
+be fixed in order to support precise generational and moving collectors.
+
+The two main issues are:
+1) identify as precisely as possible all the pointers to managed objects
+2) insert write barriers to be able to account for pointers in the old
+generation pointing to objects in the newer generations
+
+Point 1 is mostly complete. The runtime can register additional roots
+with the GC as needed and it provides to the GC precise info on the
+objects layout. In particular with the new precise GC it is not possible to
+store GC-allocated memory in IntPtr or UIntPtr fields (in fact, the new GC
+can allocate only objects and not GC-tracked untyped blobs of memory
+as the Boehm GC can do). Precise info is tracked also for static fields.
+What is currently missing here is:
+*) precise info for ThreadStatic and ContextStatic storage (this also requires
+better memory management for these sub-heaps)
+*) precise info for HANDLE_NORMAL gc handles
+*) precise info for thread stacks (this requires storing the info about
+managed stack frames along with the jitted code for a method and doing the
+stack walk for the active threads, considering conservatively the unmanaged
+stack frames and precisely the managed ones. mono_jit_info_table_find () must
+be made lock-free for this to work). Precise type info must be maintained
+for all the local variables. Precise type info should be maintained also
+for registers.
+Note that this is not a correctness issue, but a performance one. The more
+pointers to objects we can deal with precisely, the more effective the GC
+will be, since it will be able to move the objects. The first two todo items
+are mostly trivial, while handling precisely the thread stacks is complex to
+implement and to test and it has a cpu and memory use runtime penalty.
+In practice we need to be able to describe to the GC _all_ the memory
+locations that can hold a pointer to a managed object and we must tell it also
+if that location can contain:
+*) a pointer to the start of an object or NULL (typically a field of an object)
+*) a pinning pointer to an object (typically the result of the fixed statement in C#)
+*) a pointer to the managed heap or to other locations (a typical stack location)
+Since we need to provide to the GC all the locations it's not possible anymore to
+store any object in unmanaged memory if it is not explicitly pinned for the entire
+time the object is stored there.
+With the Boehm GC this was possible if the object
+was kept alive in some way, but with the new GC it is not valid anymore, because
+objects can move: the object will be kept alive because of the other reference, but the
+pointer in unmanaged memory won't be updated to the new location where the object
+has been moved.
+
+Most of the work for inserting write barrier calls is already done as well,
+but there may still be bugs in this area. In particular for it to work,
+the correct IL opcodes must be used when storing an object in a field or
+array element (most of the marshal.c code needs to be reviewed to use
+stind.ref instead of stind.i/stind.u when needed). When this is done, the
+JIT will take care of automatically inserting the write barriers.
+What the JIT does automatically for managed code must be done manually
+in the runtime C code that deals with storing fields in objects and arrays
+or otherwise any operation that could change a pointer in the old generation
+to point to an object in the new generation. Sample cases are as follows:
+
+*) when using C structs that map to managed objects the following macro
+must be used to store an object in a field (the macro must not be used
+when storing non-objects and it should not be used when storing NULL values):
+
+	MONO_OBJECT_SETREF(obj,fieldname,value)
+where obj is the pointer to the object, fieldname is the name of the field in
+the C struct and value is a MonoObject*. Note that obj must be a correctly
+typed pointer to a struct that embeds MonoObject as the first field and
+has fieldname as a field.
+
+*) when setting the element of an array of references to an object, use the
+following macro:
+
+	mono_array_setref (array,index,value)
+
+*) when copying a number of references from an array to another:
+
+	mono_array_memcpy_refs (dest,destidx,src,srcidx,count)
+
+*) when copying a struct that may contain reference fields, use:
+
+	void mono_value_copy (gpointer dest, gpointer src, MonoClass *klass)
+
+*) when it is unknown if a pointer points to the stack or to the heap and an
+object needs to be stored through it, use:
+
+	void mono_gc_wbarrier_generic_store (gpointer ptr, MonoObject* value)
+
+Note that the support for write barriers in the runtime could be
+used to enable also the generational features of the Boehm GC.
+
+Some more documentation on the new GC is available at:
+http://www.mono-project.com/Compacting_GC
+
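
Editorial sketch (not part of the patch): the new text requires every store that can make an old-generation object point to a new-generation object to go through a write barrier, and notes that a moved object leaves any unrecorded pointer stale. The toy C code below illustrates that idea with a small remembered set. It does not use the Mono API and is not how sgen-gc.c is implemented; ToyObject, toy_wbarrier_store, toy_promote, toy_remset and toy_demo are hypothetical names invented for this illustration, and the collector is reduced to promoting a single object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy object; this is NOT Mono's MonoObject. */
typedef struct ToyObject {
    int in_old_gen;             /* 1 if the object lives in the old generation */
    struct ToyObject *field;    /* a single reference field */
} ToyObject;

/* Toy remembered set: addresses of old-generation slots that may
 * point to new-generation objects. */
#define TOY_REMSET_MAX 16
static ToyObject **toy_remset[TOY_REMSET_MAX];
static size_t toy_remset_count;

/* Store 'value' into 'slot' (a field of 'owner') and record the slot
 * when the store creates an old -> new pointer. */
static void toy_wbarrier_store(ToyObject *owner, ToyObject **slot, ToyObject *value)
{
    *slot = value;
    if (value != NULL && owner->in_old_gen && !value->in_old_gen) {
        assert(toy_remset_count < TOY_REMSET_MAX); /* toy limit only */
        toy_remset[toy_remset_count++] = slot;
    }
}

/* Simulate a minor collection that moves one young object to 'new_home'.
 * Only slots recorded by the barrier are fixed up, which is why a store
 * that bypasses the barrier would leave a stale pointer behind.
 * Simplification: the set is cleared afterwards, which is only valid
 * because this toy has a single young object. */
static void toy_promote(ToyObject *young, ToyObject *new_home)
{
    size_t i;

    *new_home = *young;
    new_home->in_old_gen = 1;
    for (i = 0; i < toy_remset_count; i++) {
        if (*toy_remset[i] == young)
            *toy_remset[i] = new_home;
    }
    toy_remset_count = 0;
}

/* Small scenario; returns 0 when every step behaves as described. */
static int toy_demo(void)
{
    ToyObject old_obj = { 1, NULL };
    ToyObject young = { 0, NULL };
    ToyObject new_home = { 0, NULL };

    toy_remset_count = 0;
    toy_wbarrier_store(&old_obj, &old_obj.field, &young);
    if (toy_remset_count != 1)
        return 1;               /* old -> new store must be remembered */

    toy_promote(&young, &new_home);
    if (old_obj.field != &new_home)
        return 2;               /* recorded slot must follow the move */

    toy_wbarrier_store(&old_obj, &old_obj.field, &new_home);
    if (toy_remset_count != 0)
        return 3;               /* old -> old store is not remembered */
    return 0;
}
```

To try it, call toy_demo () from a main function and check that it returns 0. A fixed-size array keeps the sketch short; a real collector would need a structure that can grow or be flushed.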