X-Git-Url: http://wien.tomnetworks.com/gitweb/?a=blobdiff_plain;f=docs%2Fexception-handling.txt;h=1fae4e4e1b70c4830c251946430c8e5acfbebc41;hb=831403d104705c81d2fdb98473855da2e3076311;hp=1323020ee6bc11df03ab59166b124f66f8b5785b;hpb=496dfbf9ec0fd3143e5dd560a863d916e56a52b8;p=mono.git diff --git a/docs/exception-handling.txt b/docs/exception-handling.txt index 1323020ee6b..1fae4e4e1b7 100644 --- a/docs/exception-handling.txt +++ b/docs/exception-handling.txt @@ -2,252 +2,329 @@ Exception Handling In the Mono Runtime -------------------------------------- -Introduction ------------- - - There are many types of exceptions which the runtime needs to handle. These -are: -- exceptions thrown from managed code using the 'throw' or 'rethrow' CIL - instructions. -- exceptions thrown by some IL instructions like InvalidCastException thrown - by the 'castclass' CIL instruction. -- exceptions thrown by runtime code -- synchronous signals received while in managed code -- synchronous signals received while in native code -- asynchronous signals - -Since exception handling is very arch dependent, parts of the exception -handling code reside in the arch specific exceptions-<ARCH>.c files. The -architecture independent parts are in mini-exceptions.c. The different -exception types listed above are generated in different parts of the runtime, -but ultimately, they all end up in the mono_handle_exception () function in -mini-exceptions.c. - -Exceptions thrown programmatically from managed code ---------------------------------------------------- - -These exceptions are thrown from managed code using 'throw' or 'rethrow' CIL -instructions. The JIT compiler will translate them to a call to a helper -function called 'mono_arch_throw/rethrow_exception'. These helper functions do -not exist at compile time, they are created dynamically at run time by the -code in the exceptions-<ARCH>.c files. 
They perform various stack -manipulation magic, then call a helper function usually named throw_exception (), which -does further processing in C code, then calls mono_handle_exception () to do the rest. - -Exceptions thrown implicitly from managed code ----------------------------------------------- - -These exceptions are thrown by some IL instructions when something goes wrong. -When the JIT needs to throw such an exception, it emits a forward conditional -branch and remembers its position, along with the exception which needs to -be emitted. This is usually done in macros named EMIT_COND_SYSTEM_EXCEPTION in -the mini-<ARCH>.c files. After the machine code for the method is emitted, the -JIT calls the arch dependent mono_arch_emit_exceptions () function which will -add the exception throwing code to the end of the method, and patch up the -previous forward branches so they will point to this code. This has the -advantage that the rarely-executed exception throwing code is kept separate -from the method body, leading to better icache performance. -The exception throwing code branches to the dynamically generated -mono_arch_throw_corlib_exception helper function, which will create the -proper exception object, do some stack manipulation, then call -throw_exception (). - -Exceptions thrown by runtime code ---------------------------------- - -These exceptions are usually thrown by the implementations of InternalCalls -(icalls). First an appropriate exception object is created with the help of -various helper functions in metadata/exception.c, which has a separate helper -function for allocating each kind of exception object used by the runtime code. -Then the mono_raise_exception () function is called to actually throw the -exception. That function never returns. 
- -An example: - if (something_is_wrong) - mono_raise_exception (mono_get_exception_index_out_of_range ()); - -mono_raise_exception () simply passes the exception to the JIT side through -an API, where it will be received by the helper created by mono_arch_throw_exception (). From now on, it is treated as an exception thrown from managed code. - -Synchronous signals -------------------- - -For performance reasons, the runtime does not do some checks required by the -CLI spec. Instead, it relies on the CPU to do them. The two main checks which -are omitted are null-pointer checks, and arithmetic checks. When a null -pointer is dereferenced by JITted code, the CPU will notify the kernel through -an interrupt, and the kernel will send a SIGSEGV signal to the process. The -runtime installs a signal handler for SIGSEGV, which is -sigsegv_signal_handler () in mini.c. The signal handler creates the appropriate -exception object and calls mono_handle_exception () with it. Arithmetic -exceptions like division by zero are handled similarly. - -Synchronous signals in native code ----------------------------------- - -Receiving a signal such as SIGSEGV while in native code means something very -bad has happened. Because of this, the runtime will abort after trying to print a -managed plus a native stack trace. The logic is in the mono_handle_native_sigsegv () -function. -Note that there are two kinds of native code which can be the source of the signal: -- code inside the runtime -- code inside a native library loaded by an application, e.g. libgtk+ - -Stack overflow checking ------------------------ - - Stack overflow exceptions need special handling. When a thread overflows its -stack, the kernel sends it a normal SIGSEGV signal, but the signal handler -tries to execute on the same stack as the thread, leading to a further SIGSEGV which -will terminate the thread. 
A solution is to use an alternative signal stack -supported by UNIX operating systems through the sigaltstack (2) system call. -When a thread starts up, the runtime will install an altstack using the -mono_setup_altstack () function in mini-exceptions.c. When a SIGSEGV is -received, the signal handler checks whether the fault address is near the -bottom of the thread's normal stack. If it is, a StackOverflowException is -created instead of a NullReferenceException. This exception is handled like -any other exception, with some minor differences. - Working sigaltstack support is very much os/kernel/libc dependent, so it is -disabled by default. - -Asynchronous signals -------------------- - - Async signals are used by the runtime to notify a thread that it needs to -change its state somehow. Currently, they are used for implementing -thread abort/suspend/resume. - - Handling async signals correctly is a very hard problem, since the receiving -thread can be in basically any state upon receipt of the signal. It can -execute managed code, native code, it can hold various managed/native locks, or -it can be in the process of acquiring them, it can be starting up, shutting down -etc. Most of the C APIs used by the runtime are not async-signal safe, -meaning it is not safe to call them from an async signal handler. In -particular, the pthread locking functions are not async-safe, so if a -signal handler interrupted code which was in the process of acquiring a lock, -and the signal handler tries to acquire a lock, the thread will deadlock. -Unfortunately, the current signal handling code does acquire locks, so -sometimes it does deadlock. - -When receiving an async signal, the signal handler first tries to determine -whether the thread was executing managed code when it was interrupted. If -it was, then it is safe to interrupt it, so a ThreadAbortException is -constructed and thrown. If the thread was executing native code, then it is -generally not safe to interrupt it. 
In this case, the runtime sets a flag -then returns from the signal handler. That flag is checked every time the -runtime returns from native code to managed code, and the exception is thrown -then. Also, a platform specific mechanism is used to cause the thread to -interrupt any blocking operation it might be doing. - -The async signal handler is in sigusr1_signal_handler () in mini.c, while -the logic which determines whether an exception is safe to be thrown is in -mono_thread_request_interruption (). - -Stack unwinding during exception handling ------------------------------------------ - -The execution state of a thread during exception handling is stored in an -arch-specific structure called MonoContext. This structure contains the values -of all the CPU registers relevant during exception handling, which -usually means: -- IP (instruction pointer) -- SP (stack pointer) -- FP (frame pointer) -- callee saved registers - -Callee saved registers are the registers which are required by any procedure -to be saved/restored before/after using them. They are usually defined by -each platform's ABI (Application Binary Interface). For example, on x86, they -are EBX, ESI and EDI. - -The code which calls mono_handle_exception () is required to construct the -initial MonoContext. How this is done depends on the caller. For exceptions -thrown from managed code, the mono_arch_throw_exception helper function -saves the values of the required registers and passes them to throw_exception (), which will save them in the MonoContext structure. For exceptions thrown from -signal handlers, the MonoContext structure is initialized from the signal info -received from the kernel. - -During exception handling, the runtime needs to 'unwind' the stack, i.e. -given the state of the thread at a stack frame, construct the state at its -callers. Since this is platform specific, it is done by a platform specific -function called mono_arch_find_jit_info (). 
- -Two kinds of stack frames need handling: -- Managed frames are easier. The JIT will store some information about each - managed method, like which callee-saved registers it uses. Based on this - information, mono_arch_find_jit_info () can find the values of the registers - on the thread stack, and restore them. -- Native frames are problematic, since we have no information about how to - unwind through them. Some compilers generate unwind information for code, - some don't. Also, there is no general purpose library to obtain and decode - this unwind information. So the runtime uses a different solution. When - managed code needs to call into native code, it does so through a - managed->native wrapper function, which is generated by the JIT. This - function is responsible for saving the machine state into a per-thread - structure called MonoLMF (Last Managed Frame). These LMF structures are - stored on the thread's stack, and are linked together using one of their - fields. When the unwinder encounters a native frame, it simply pops - one entry of the LMF 'stack', and uses it to restore the frame state to the - moment before control passed to native code. In effect, all successive native - frames are skipped together. +* Introduction +-------------- + + There are many types of exceptions which the runtime needs to + handle. These are: + + - exceptions thrown from managed code using the 'throw' or 'rethrow' CIL + instructions. + + - exceptions thrown by some IL instructions like InvalidCastException thrown + by the 'castclass' CIL instruction. + + - exceptions thrown by runtime code + + - synchronous signals received while in managed code + + - synchronous signals received while in native code + + - asynchronous signals + + Since exception handling is very arch dependent, parts of the + exception handling code reside in the arch specific + exceptions-<ARCH>.c files. The architecture independent parts + are in mini-exceptions.c. 
The different exception types listed + above are generated in different parts of the runtime, but + ultimately, they all end up in the mono_handle_exception () + function in mini-exceptions.c. + +* Exceptions thrown programmatically from managed code +------------------------------------------------------ + + These exceptions are thrown from managed code using 'throw' or + 'rethrow' CIL instructions. The JIT compiler will translate + them to a call to a helper function called + 'mono_arch_throw/rethrow_exception'. + + These helper functions do not exist at compile time, they are + created dynamically at run time by the code in the + exceptions-<ARCH>.c files. + + They perform various stack manipulation magic, then call a + helper function usually named throw_exception (), which does + further processing in C code, then calls + mono_handle_exception() to do the rest. + +* Exceptions thrown implicitly from managed code +------------------------------------------------ + + These exceptions are thrown by some IL instructions when + something goes wrong. When the JIT needs to throw such an + exception, it emits a forward conditional branch and remembers + its position, along with the exception which needs to be + emitted. This is usually done in macros named + EMIT_COND_SYSTEM_EXCEPTION in the mini-<ARCH>.c files. + + After the machine code for the method is emitted, the JIT + calls the arch dependent mono_arch_emit_exceptions () function + which will add the exception throwing code to the end of the + method, and patch up the previous forward branches so they + will point to this code. + + This has the advantage that the rarely-executed exception + throwing code is kept separate from the method body, leading + to better icache performance. + + The exception throwing code branches to the dynamically + generated mono_arch_throw_corlib_exception helper function, + which will create the proper exception object, do some stack + manipulation, then call throw_exception (). 
+ +* Exceptions thrown by runtime code +----------------------------------- + + These exceptions are usually thrown by the implementations of + InternalCalls (icalls). First an appropriate exception object + is created with the help of various helper functions in + metadata/exception.c, which has a separate helper function for + allocating each kind of exception object used by the runtime + code. Then the mono_raise_exception () function is called to + actually throw the exception. That function never returns. + + An example: + + if (something_is_wrong) + mono_raise_exception (mono_get_exception_index_out_of_range ()); + + mono_raise_exception () simply passes the exception to the JIT + side through an API, where it will be received by the helper + created by mono_arch_throw_exception (). From now on, it is + treated as an exception thrown from managed code. + +* Synchronous signals +--------------------- + + For performance reasons, the runtime does not do some checks + required by the CLI spec. Instead, it relies on the CPU to do + them. The two main checks which are omitted are null-pointer + checks, and arithmetic checks. When a null pointer is + dereferenced by JITted code, the CPU will notify the kernel + through an interrupt, and the kernel will send a SIGSEGV + signal to the process. The runtime installs a signal handler + for SIGSEGV, which is sigsegv_signal_handler () in mini.c. The + signal handler creates the appropriate exception object and + calls mono_handle_exception () with it. Arithmetic exceptions + like division by zero are handled similarly. + +* Synchronous signals in native code +------------------------------------ + + Receiving a signal such as SIGSEGV while in native code means + something very bad has happened. Because of this, the runtime + will abort after trying to print a managed plus a native stack + trace. The logic is in the mono_handle_native_sigsegv () + function. 
+ + Note that there are two kinds of native code which can be the + source of the signal: + + - code inside the runtime + - code inside a native library loaded by an application, e.g. libgtk+ + +* Stack overflow checking +------------------------- + + Stack overflow exceptions need special handling. When a thread + overflows its stack, the kernel sends it a normal SIGSEGV + signal, but the signal handler tries to execute on the same + stack as the thread, leading to a further SIGSEGV which will + terminate the thread. A solution is to use an alternative signal stack + supported by UNIX operating systems through the sigaltstack + (2) system call. When a thread starts up, the runtime will + install an altstack using the mono_setup_altstack () function + in mini-exceptions.c. When a SIGSEGV is received, the signal + handler checks whether the fault address is near the bottom + of the thread's normal stack. If it is, a + StackOverflowException is created instead of a + NullReferenceException. This exception is handled like any other + exception, with some minor differences. + + There are two reasons why sigaltstack is disabled by default: + + * The main problem with sigaltstack() is that the stack + employed by it is not visible to the GC and it is possible + that the GC will miss it. + + * Working sigaltstack support is very much os/kernel/libc + dependent. + + +* Asynchronous signals +---------------------- + Async signals are used by the runtime to notify a thread that + it needs to change its state somehow. Currently, they are used + for implementing thread abort/suspend/resume. + + Handling async signals correctly is a very hard problem, + since the receiving thread can be in basically any state upon + receipt of the signal. It can execute managed code, native + code, it can hold various managed/native locks, or it can be + in the process of acquiring them, it can be starting up, + shutting down etc. 
Most of the C APIs used by the runtime are + not async-signal safe, meaning it is not safe to call them + from an async signal handler. In particular, the pthread + locking functions are not async-safe, so if a signal handler + interrupted code which was in the process of acquiring a lock, + and the signal handler tries to acquire a lock, the thread + will deadlock. Unfortunately, the current signal handling + code does acquire locks, so sometimes it does deadlock. + + When receiving an async signal, the signal handler first tries + to determine whether the thread was executing managed code + when it was interrupted. If it was, then it is safe to + interrupt it, so a ThreadAbortException is constructed and + thrown. If the thread was executing native code, then it is + generally not safe to interrupt it. In this case, the runtime + sets a flag then returns from the signal handler. That flag is + checked every time the runtime returns from native code to + managed code, and the exception is thrown then. Also, a + platform specific mechanism is used to cause the thread to + interrupt any blocking operation it might be doing. + + The async signal handler is in sigusr1_signal_handler () in + mini.c, while the logic which determines whether an exception + is safe to be thrown is in mono_thread_request_interruption + (). + +* Stack unwinding during exception handling +------------------------------------------- + + The execution state of a thread during exception handling is + stored in an arch-specific structure called MonoContext. This + structure contains the values of all the CPU registers + relevant during exception handling, which usually means: + + - IP (instruction pointer) + - SP (stack pointer) + - FP (frame pointer) + - callee saved registers + + Callee saved registers are the registers which are required by + any procedure to be saved/restored before/after using + them. They are usually defined by each platform's ABI + (Application Binary Interface). 
For example, on x86, they are + EBX, ESI and EDI. + + The code which calls mono_handle_exception () is required to + construct the initial MonoContext. How this is done depends on + the caller. For exceptions thrown from managed code, the + mono_arch_throw_exception helper function saves the values of + the required registers and passes them to throw_exception (), + which will save them in the MonoContext structure. For + exceptions thrown from signal handlers, the MonoContext + structure is initialized from the signal info received from the + kernel. + + During exception handling, the runtime needs to 'unwind' the + stack, i.e. given the state of the thread at a stack frame, + construct the state at its callers. Since this is platform + specific, it is done by a platform specific function called + mono_arch_find_jit_info (). + + Two kinds of stack frames need handling: + + - Managed frames are easier. The JIT will store some + information about each managed method, like which + callee-saved registers it uses. Based on this information, + mono_arch_find_jit_info () can find the values of the + registers on the thread stack, and restore them. + + - Native frames are problematic, since we have no information + about how to unwind through them. Some compilers generate + unwind information for code, some don't. Also, there is no + general purpose library to obtain and decode this unwind + information. So the runtime uses a different solution. When + managed code needs to call into native code, it does so through + a managed->native wrapper function, which is generated by + the JIT. This function is responsible for saving the machine + state into a per-thread structure called MonoLMF (Last + Managed Frame). These LMF structures are stored on the + thread's stack, and are linked together using one of their + fields. 
When the unwinder encounters a native frame, it + simply pops one entry of the LMF 'stack', and uses it to + restore the frame state to the moment before control passed + to native code. In effect, all successive native frames are + skipped together. + Problems/future work -------------------- 1. Async signal safety ---------------------- -The current async signal handling code is not async safe, so it can and does -deadlock in practice. It needs to be rewritten to avoid taking locks at least -until it can determine that it was interrupting managed code. - -Another problem is the managed stack frame unwinding code. It blindly assumes -that if the IP points into a managed frame, then all the callee saved -registers + the stack pointer are saved on the stack. This is not true if -the thread was interrupted while executing the method prolog/epilog. - + The current async signal handling code is not async safe, so + it can and does deadlock in practice. It needs to be rewritten + to avoid taking locks at least until it can determine that it + was interrupting managed code. + + Another problem is the managed stack frame unwinding code. It + blindly assumes that if the IP points into a managed frame, + then all the callee saved registers + the stack pointer are + saved on the stack. This is not true if the thread was + interrupted while executing the method prolog/epilog. + 2. Raising exceptions from native code -------------------------------------- -Currently, exceptions are raised by calling mono_raise_exception () in -the middle of runtime code. This has two problems: -- No cleanup is done, ie. if the caller of the function which throws an - exception has taken locks, or allocated memory, that is not cleaned up. For - this reason, it is only safe to call mono_raise_exception () 'very close' to - managed code, ie. in the icall functions themselves. 
-- To allow mono_raise_exception () to unwind through native code, we need to - save the LMF structures which can add a lot of overhead even in the common - case when no exception is thrown. So this is not zero-cost exception handling. - - An alternative might be to use a JNI style set-pending-exception API. -Runtime code could call mono_set_pending_exception (), then return to its -caller with an error indication allowing the caller to clean up. When execution -returns to managed code, the managed->native wrapper could check whether -there is a pending exception and throw it if necessary. Since we already check -for pending thread interruption, this would have no overhead, allowing us -to drop the LMF saving/restoring code, or significant parts of it. - + Currently, exceptions are raised by calling + mono_raise_exception () in the middle of runtime code. This + has two problems: + + - No cleanup is done, ie. if the caller of the function which + throws an exception has taken locks, or allocated memory, + that is not cleaned up. For this reason, it is only safe to + call mono_raise_exception () 'very close' to managed code, + ie. in the icall functions themselves. + + - To allow mono_raise_exception () to unwind through native + code, we need to save the LMF structures which can add a lot + of overhead even in the common case when no exception is + thrown. So this is not zero-cost exception handling. + + An alternative might be to use a JNI style + set-pending-exception API. Runtime code could call + mono_set_pending_exception (), then return to its caller with + an error indication allowing the caller to clean up. When + execution returns to managed code, the managed->native + wrapper could check whether there is a pending exception and + throw it if necessary. Since we already check for pending + thread interruption, this would have no overhead, allowing us + to drop the LMF saving/restoring code, or significant parts of + it. + 4. 
libunwind ------------ -There is an OSS project called libunwind which is a standalone stack unwinding -library. It is currently in development, but it is used by default by gcc on -ia64 for its stack unwinding. The mono runtime also uses it on ia64. It has -several advantages in relation to our current unwinding code: -- it has a platform independent API, i.e. the same unwinding code can be used - on multiple platforms. -- it can generate unwind tables which are correct at every instruction, i.e. - can be used for unwinding from async signals. -- given sufficient unwind info generated by a C compiler, it can unwind through - C code. -- most of its API is async-safe -- it implements the gcc C++ exception handling API, so in theory it can - be used to implement mixed-language exception handling (i.e. C++ exception - caught in mono, mono exception caught in C++). -- it is MIT licensed - -The biggest problem with libunwind is its platform support. ia64 support is -complete/well tested, while support for other platforms is missing/incomplete. - -http://www.hpl.hp.com/research/linux/libunwind/ + There is an OSS project called libunwind which is a standalone + stack unwinding library. It is currently in development, but + it is used by default by gcc on ia64 for its stack + unwinding. The mono runtime also uses it on ia64. It has + several advantages in relation to our current unwinding code: + + - it has a platform independent API, i.e. the same unwinding + code can be used on multiple platforms. + + - it can generate unwind tables which are correct at every + instruction, i.e. can be used for unwinding from async + signals. + + - given sufficient unwind info generated by a C compiler, it + can unwind through C code. + + - most of its API is async-safe + + - it implements the gcc C++ exception handling API, so in + theory it can be used to implement mixed-language exception + handling (i.e. C++ exception caught in mono, mono exception + caught in C++). 
+ - it is MIT licensed + + The biggest problem with libunwind is its platform support. ia64 support is + complete/well tested, while support for other platforms is missing/incomplete. + + http://www.hpl.hp.com/research/linux/libunwind/ +
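The JNI style set-pending-exception alternative discussed under "Raising exceptions from native code" could look roughly like the C sketch below. It is an illustration only, not runtime code: the text merely proposes mono_set_pending_exception (), and every name used here (SketchException, the sketch_* functions, index_out_of_range) is hypothetical. The sketch assumes a C11 compiler for _Thread_local. Runtime code records the exception and returns an error code so its callers can clean up, and a stand-in for the managed->native wrapper checks the per-thread slot on the way back.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a managed exception object. */
typedef struct {
    const char *type_name;
} SketchException;

/* One pending-exception slot per thread (assumes C11 _Thread_local). */
static _Thread_local SketchException *pending_exception = NULL;

/* Record an exception instead of unwinding; the first one recorded wins. */
static void sketch_set_pending_exception(SketchException *exc)
{
    if (pending_exception == NULL)
        pending_exception = exc;
}

/* Fetch and clear the pending exception, or NULL if there is none. */
static SketchException *sketch_take_pending_exception(void)
{
    SketchException *exc = pending_exception;
    pending_exception = NULL;
    return exc;
}

static SketchException index_out_of_range = { "IndexOutOfRangeException" };

/* icall-like runtime function: on error it sets the pending exception and
 * returns -1, so callers can release locks or memory before returning. */
static int sketch_array_get(const int *arr, size_t len, size_t idx, int *out)
{
    assert(arr != NULL && out != NULL);
    if (idx >= len) {
        sketch_set_pending_exception(&index_out_of_range);
        return -1;
    }
    *out = arr[idx];
    return 0;
}

/* Stand-in for the managed->native wrapper: after the native call returns,
 * it checks the slot and hands back the exception that should be thrown. */
static SketchException *sketch_wrapper_call(const int *arr, size_t len,
                                            size_t idx, int *out)
{
    (void) sketch_array_get(arr, len, idx, out);
    return sketch_take_pending_exception();
}
```

A caller would invoke sketch_wrapper_call () and throw whatever it returns, if anything. Because the native function returns normally in both cases, nothing has to unwind through native frames, which is why the text suggests this design could remove much of the LMF saving/restoring.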