While the Boehm GC is quite good, we need to move to a precise, generational GC for better performance and smaller memory usage (no false-positives memory retentions with big allocations). The first working implementation is committed in metadata/sgen-gc.c as of May, 2006. This is a two-generations moving collector and it is currently used to shake out all the issues in the runtime that need to be fixed in order to support precise generational and moving collectors. The two main issues are: 1) identify as precisely as possible all the pointers to managed objects 2) insert write barriers to be able to account for pointers in the old generation pointing to objects in the newer generations Point 1 is mostly complete. The runtime can register additional roots with the GC as needed and it provides to the GC precise info on the objects layout. In particular with the new precise GC it is not possible to store GC-allocated memory in IntPtr or UIntPtr fields (in fact, the new GC can allocate only objects and not GC-tracked untyped blobs of memory as the Boehm GC can do). Precise info is tracked also for static fields. What is currently missing here is: *) precise info for ThreadStatic and ContextStatic storage (this also requires better memory management for these sub-heaps) *) precise info for HANDLE_NORMAL gc handles *) precise info for thread stacks (this requires storing the info about managed stack frames along with the jitted code for a method and doing the stack walk for the active threads, considering conservatively the unmanaged stack frames and precisely the managed ones. mono_jit_info_table_find () must be made lock-free for this to work). Precise type info must be maintained for all the local variables. Precise type info should be maintained also for registers. Note that this is not a correctness issue, but a performance one. The more pointers to objects we can deal with precisely, the more effective the GC will be, since it will be able to move the objects. The first two todo items are mostly trivial, while handling precisely the thread stacks is complex to implement and to test and it has a cpu and memory use runtime penalty. In practice we need to be able to describe to the GC _all_ the memory locations that can hold a pointer to a managed object and we must tell it also if that location can contain: *) a pointer to the start of an object or NULL (typically a field of an object) *) a pinning pointer to an object (typically the result of the fixed statment in C#) *) a pointer to the managed heap or to other locations (a typical stack location) Since we need to provide to the GC all the locations it's not possible anymore to store any object in unmanaged memory if it is not explicitly pinned for the entire time the object is stored there. With the Boehm GC this was possible if the object was kept alive in some way, but with the new GC it is not valid anymore, because objects can move: the object will be kept alive because of the other reference, but the pointer in unmanaged memory won't be updated to the new location where the object has been moved. Most of the work for inserting write barrier calls is already done as well, but there may be still bugs in this area. In particular for it to work, the correct IL opcodes must be used when storing an object in a field or array element (most of the marshal.c code needs to be reviewed to use stind.ref instead of stind.i/stind.u when needed). When this is done, the JIT will take care of automatically inserting the write barriers. What the JIT does automatically for managed code, must be done manually in the runtime C code that deals with storing fields in objects and arrays or otherwise any operation that could change a pointer in the old generation to point to an object in the new generation. Sample cases are as follows: *) when using C structs that map to managed objects the following macro must be used to store an object in a field (the macro must not be used when storing non-objects and it should not be used when storing NULL values): MONO_OBJECT_SETREF(obj,fieldname,value) where obj is the pointer to the object, fieldname is the name of the field in the C struct and value is a MonoObject*. Note that obj must be a correctly typed pointer to a struct that embeds MonoObject as the first field and have fieldname as a field. *) when setting the element of an array of references to an object, use the following macro: mono_array_setref (array,index,value) *) when copying a number of references from an array to another: mono_array_memcpy_refs (dest,destidx,src,srcidx,count) *) when copying a struct that may containe reference fields, use: void mono_value_copy (gpointer dest, gpointer src, MonoClass *klass) *) when it is unknown if a pointer points to the stack or to the heap and an object needs to be stored through it, use: void mono_gc_wbarrier_generic_store (gpointer ptr, MonoObject* value) Note that the support for write barriers in the runtime could be used to enable also the generational features of the Boehm GC. Some more documentation on the new GC is available at: http://www.mono-project.com/Compacting_GC