Merge pull request #5714 from alexischr/update_bockbuild

diff --git a/docs/precise-gc b/docs/precise-gc

index f0dd52e5792c77a6c118aa97f90871c9fce8ff70..cf0f733639d1646d2ceaad65e2d98107bcc1a817 100644
--- a/docs/precise-gc
+++ b/docs/precise-gc
@@ -3,63 +3,95 @@ precise, generational GC for better performance and smaller
  memory usage (no false-positives memory retentions with big
  allocations).
  
-This is a large task, but it can be done in steps.
-
-1) use the GCJ support to mark reference fields in objects, so
-scanning the heap is faster. This is mostly done already, needs
-checking that it is always used correctly (big objects, arrays).
-
-2) don't include in the static roots the .bss and .data segments
-to save in scanning time and limit false-positives. This is mostly 
-done already.
-
-3) keep track precisely of stack locations and registers in native
-code generation. This basically requires the regalloc rewrite code 
-first, if we don't want to duplicate much of it. This is the hardest 
-task of all, since proving its correctness is very hard. Some tricks,
-like having a build that injects GC.Collect() after every few simple 
-operations may help. We also need to decide if we want to handle safe 
-points at calls and back jumps only or at every instruction. The latter
-case is harder to implement and requires we keep around much more data
-(it potentially makes for faster stop-the-world phases).
-The first case requires us to be able to advance a thread until it 
-reaches the next safe point: this can be done with the same techniques 
-used by a debugger. We already need something like this to handle
-safely aborts happening in the middle of a prolog in managed code, 
-for example, so this could be an additional sub-task that can be done
-separately from the GC work.
-Note that we can adapt the libgc code to use the info we collect
-when scanning the stack in managed methods and still use the conservative
-approach for the unmanaged stack, until we have our own collector,
-which requires we define a proper icall interface to switch from managed 
-to unmanaged code (how do we handle object references in the icall 
-implementations, for example).
-
-4) we could make use of the generational capabilities of the 
-Boehm GC, but not with the current method involving signals which
-may create incompatibilities and is not supported on all platforms.
-We need to start using write barriers: they will be required anyway
-for the generational GC we'll use. When a field holding a reference
-is changed in an object (or an item in an array), we mark the card
-or page where the field is stored as dirty. Later, when a collection 
-is run, only objects in pages marked as dirty are scanned for
-references instead of the whole heap. This could take a few days to
-implement and probably much more time to debug if all the cases were 
-not caught :-)
-
-5) actually write the new generational and precise collector. There are
-several examples out there as open source projects, though the CLR
-needs some specific semantics so the code needs to be written from 
-scratch anyway. Compared to item 3 this is relatively easier and it can
-be tested outside of mono, too, until mono is ready to use it.
-The important features needed:
-*) precise, so there is no false positive memory retention
-*) generational to reduce collection times
-*) pointer-hopping allocation to reduce alloc time
-*) possibly per-thread lock-free allocation
-*) handle weakrefs and finalizers with the CLR semantics
-
-The different tasks can be done in parallel. 1, 2 and 4 can be done in time
-for the mono 1.2 release. Parts of 3 and 5 could be done as well.
-The complete switch is supposed to happen with the mono 2.0 release.
+The first working implementation is committed in metadata/sgen-gc.c
+as of May 2006. This is a two-generation moving collector and it is
+currently used to shake out all the issues in the runtime that need to
+be fixed in order to support precise generational and moving collectors.
+
+The two main issues are:
+1) identify as precisely as possible all the pointers to managed objects
+2) insert write barriers to be able to account for pointers in the old
+generation pointing to objects in the newer generations
+
+Point 1 is mostly complete. The runtime can register additional roots
+with the GC as needed and it provides the GC with precise info on
+object layout. In particular, with the new precise GC it is not possible to
+store GC-allocated memory in IntPtr or UIntPtr fields (in fact, the new GC
+can allocate only objects and not GC-tracked untyped blobs of memory
+as the Boehm GC can do). Precise info is also tracked for static fields.
+What is currently missing here is:
+*) precise info for ThreadStatic and ContextStatic storage (this also requires
+better memory management for these sub-heaps)
+*) precise info for HANDLE_NORMAL gc handles
+*) precise info for thread stacks (this requires storing the info about
+managed stack frames along with the jitted code for a method and doing the
+stack walk for the active threads, considering conservatively the unmanaged
+stack frames and precisely the managed ones. mono_jit_info_table_find () must
+be made lock-free for this to work). Precise type info must be maintained
+for all the local variables. Precise type info should be maintained also
+for registers.
+Note that this is not a correctness issue, but a performance one. The more
+pointers to objects we can deal with precisely, the more effective the GC
+will be, since it will be able to move the objects. The first two todo items
+are mostly trivial, while handling the thread stacks precisely is complex to
+implement and to test and it carries a runtime penalty in CPU and memory use.
+In practice we need to be able to describe to the GC _all_ the memory
+locations that can hold a pointer to a managed object and we must also tell it
+whether that location can contain:
+*) a pointer to the start of an object or NULL (typically a field of an object)
+*) a pinning pointer to an object (typically the result of the fixed statement in C#)
+*) a pointer to the managed heap or to other locations (a typical stack location)
+Since we need to provide the GC with all the locations, it is no longer possible to
+store a reference to an object in unmanaged memory unless the object is explicitly
+pinned for the entire time the reference is stored there. With the Boehm GC this was
+possible if the object was kept alive in some other way, but with the new GC it is no
+longer valid, because objects can move: the object will be kept alive by the other
+reference, but the pointer in unmanaged memory won't be updated to the new location
+where the object has been moved.
+
+Most of the work for inserting write barrier calls is already done as well,
+but there may still be bugs in this area. In particular, for it to work,
+the correct IL opcodes must be used when storing an object in a field or
+array element (most of the marshal.c code needs to be reviewed to use 
+stind.ref instead of stind.i/stind.u when needed). When this is done, the
+JIT will take care of automatically inserting the write barriers.
+What the JIT does automatically for managed code must be done manually
+in the runtime C code that deals with storing fields in objects and arrays,
+or with any other operation that could change a pointer in the old generation
+to point to an object in the new generation. Sample cases are as follows:
+
+*) when using C structs that map to managed objects the following macro
+must be used to store an object in a field (the macro must not be used
+when storing non-objects and it should not be used when storing NULL values):
+
+       MONO_OBJECT_SETREF(obj,fieldname,value)
+where obj is the pointer to the object, fieldname is the name of the field in
+the C struct and value is a MonoObject*. Note that obj must be a correctly
+typed pointer to a struct that embeds MonoObject as the first field and
+has fieldname as a field.
+
+*) when setting the element of an array of references to an object, use the
+following macro:
+
+       mono_array_setref (array,index,value)
+
+*) when copying a number of references from an array to another:
+
+       mono_array_memcpy_refs (dest,destidx,src,srcidx,count)
+
+*) when copying a struct that may contain reference fields, use:
+
+       void mono_value_copy (gpointer dest, gpointer src, MonoClass *klass)
+
+*) when it is unknown if a pointer points to the stack or to the heap and an
+object needs to be stored through it, use:
+
+       void mono_gc_wbarrier_generic_store (gpointer ptr, MonoObject* value)
+
+Note that the support for write barriers in the runtime could also be
+used to enable the generational features of the Boehm GC.
+
+Some more documentation on the new GC is available at:
+http://www.mono-project.com/Compacting_GC
+
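
The write barrier requirement described in the new text can be illustrated
with a small self-contained model. This is only a sketch under stated
assumptions: it uses card marking (the scheme mentioned in the removed text),
and every name in it (toy_wbarrier_store, old_gen, card_table, the 512-byte
card size) is hypothetical. It is not the code in metadata/sgen-gc.c and it
does not use the Mono API. It shows why a generic store through a pointer of
unknown origin has to check whether the slot lies in the old generation
before recording it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model of a card-marking write barrier (not Mono code). */
#define CARD_BITS     9                               /* assumed 512-byte cards */
#define OLD_GEN_SLOTS 4096
#define OLD_GEN_BYTES (OLD_GEN_SLOTS * sizeof (void *))
#define NUM_CARDS     (OLD_GEN_BYTES >> CARD_BITS)

static void *old_gen[OLD_GEN_SLOTS];         /* stand-in for old-generation memory */
static unsigned char card_table[NUM_CARDS];  /* one dirty flag per card */

/* Returns nonzero if the slot address lies inside the old-generation area. */
static int toy_in_old_gen (void **slot)
{
    uintptr_t a = (uintptr_t) slot;
    uintptr_t base = (uintptr_t) old_gen;
    return a >= base && a < base + OLD_GEN_BYTES;
}

/* Stores value through slot. If the slot is in the old generation, the card
 * covering it is marked dirty so that a minor collection scans only that card.
 * Slots elsewhere (for example on the stack) need no bookkeeping here. */
static void toy_wbarrier_store (void **slot, void *value)
{
    assert (slot != NULL);
    *slot = value;
    if (toy_in_old_gen (slot))
        card_table[((uintptr_t) slot - (uintptr_t) old_gen) >> CARD_BITS] = 1;
}

/* Number of cards a minor collection would have to scan. */
static size_t toy_count_dirty_cards (void)
{
    size_t i, n = 0;
    for (i = 0; i < NUM_CARDS; i++)
        n += card_table[i];
    return n;
}

/* Resets the card table, as would happen after a collection. */
static void toy_clear_cards (void)
{
    memset (card_table, 0, sizeof (card_table));
}
```

In this model, storing through a pointer to a local variable leaves the card
table untouched, while storing through &old_gen[i] dirties exactly one card,
and repeated stores into the same card are recorded only once.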