1 \documentclass[twocolumn,a4paper]{article} % -*- latex -*-
5 {\tt http://www.unix.cslab.tuwien.ac.at/\char'176schani/}
7 \title{Threads in Cacao}
9 \newcommand{\globvarslabel}[1]{\mbox{\texttt{#1}}\hfil}
10 \newenvironment{globvars}
12 {\renewcommand{\makelabel}{\globvarslabel}%
17 \newcommand{\structdesclabel}[1]{\mbox{\texttt{#1}}\hfil}
18 \newenvironment{structdesc}
20 {\renewcommand{\makelabel}{\structdesclabel}%
32 To fulfill its role as a Java runtime environment, Cacao has to
33 provide support for threads. A thread in Java is what is generally
34 understood as a 'thread' in computer science: a context of execution
35 within a process that shares resources (virtual memory, file
36 descriptors, \dots ) with all other threads in the same process.
38 The Java Virtual Machine Specification does not specify how threads
39 should be implemented, so we had two alternatives to choose from:
42 \item Implementing Java threads using the corresponding thread implementation of the
43 underlying operating system.
44 \item Using just one operating system thread and implementing Java threads by doing
48 A third alternative, namely using processes to emulate threads, is not
49 viable, because the sharing of all resources between processes (at
50 least UNIX processes) is almost impossible.
52 Both alternatives have their own advantages and drawbacks. Using
53 operating system threads avoids implementing all the gory details of
54 threads and gives scheduling control to the operating system, meaning
55 that it is not only preemptive but also multi-processor-aware.
56 Rolling it our own, on the other hand, does not waste operating system
57 resources (imagine a Java application that spawns over a hundred
58 threads!) and gives us much more control over thread-switching, which
59 makes it a lot easier to avoid synchronization problems. Furthermore,
60 it makes it possible to port Cacao to platforms which do not support
61 threads at the operating system level.
63 Since there already was a user-level thread implementation for a Java
64 virtual machine available as free software, we chose to build upon it,
65 thereby avoiding all the nasty coding details, without giving up the
66 advantages of user-level threads. We did, however, leave open the
67 possibility of replacing the user-level threads with an operating
68 system implementation.
70 As outlined in the previous paragraph, the threading code of Cacao is
71 based on the code of Tim Wilkinson's Kaffe (version 0.7), which has
72 been released under a BSD-style license. We had to make a few
73 adjustments to the code, mainly making it '64-bit-aware'.
75 \section{Main Parts of the System}
78 The bulk of the threading system is located in the \texttt{threads/}
79 directory of the Cacao distribution. \texttt{threads/thread.c}
80 contains the thread management code (starting, suspending, killing,
81 \dots ), \texttt{threads/locks.c} contains the implementation of
82 synchronization primitives, which are discussed in more detail in
83 section~\ref{sec:synchronization}.
84 \texttt{threads/threadio.c} implements thread-aware I/O, which is detailed in
87 The platform dependent parts of the system (which are pretty few) are
88 to be found in the files \texttt{sysdep/threads.h} and
89 \texttt{sysdep/asmpart.c}. They are discussed in
90 section~\ref{sec:threadswitching}.
92 There are a few other locations where changes had to be made for
93 threading support which are described in
94 sections~\ref{sec:criticalsections} and~\ref{sec:gc}.
96 \section{Data Structures}
97 \label{sec:datastructures}
99 The \texttt{java.lang.Thread} class, as defined in Sunsoft's Java
100 Runtime Library, provides a private instance variable
101 \texttt{PrivateInfo} of pointer type, which is reserved for the
102 purposes of the runtime system. Cacao uses the integer value of
103 \texttt{PrivateInfo} as an index into the array \texttt{contexts[]}
104 (see section~\ref{sec:globalvars}), which contains context information
105 (most notably priority and stack information) of all live threads:
119 java_objectheader *exceptionptr;
120 struct _thread *nextlive;
127 \item[free] Is \texttt{true} if this \texttt{ctx} is associated with a thread.
129 \item[status] One of \texttt{THREAD\_SUSPENDED}, \texttt{THREAD\_RUNNING} and
130 \texttt{THREAD\_DEAD}.
132 \item[priority] The priority of the thread. Must be in the range \texttt{MIN\_PRIORITY} (1) to
133 \texttt{MAX\_PRIORITY} (10).
135 \item[restorePoint] Top of the stack of the corresponding thread.
137 \item[stackMem] Begin of the region of memory allocated for the stack, including the
140 \item[stackBase] Begin of the region of memory that can actually be used for the stack (i.e.
141 the memory right after the guarding pages).
143 \item[stackEnd] Pointer to the first byte of memory after the region allocated for the stack.
145 \item[usedStackTop] Pointer to the actual top of the stack. Only valid if the corresponding
146 thread is not running.
148 \item[time] Not used yet.
150 \item[exceptionptr] Saves the \texttt{exceptionptr} of the corresponding thread if it is
153 \item[nextlive] Maintains a link to the next thread in the \texttt{livethreads} list
154 (see section~\ref{sec:globalvars}).
156 \item[flags] Bitwise combination of several flags. If \texttt{THREAD\_FLAGS\_NOSTACKALLOC}
157 is set, then the thread stack should not be dealloced upon thread death.
158 \texttt{THREAD\_FLAGS\_USER\_SUSPEND} is set after the thread has been suspended by the user
159 (i.e. the thread was not suspended by a blocking operation). \texttt{THREAD\_FLAGS\_KILLED}
160 seems not to be used.
165 \label{sec:scheduling}
167 Thread scheduling is performed by the routine
168 \texttt{reschedule()}. It checks the lists
169 \texttt{threadQhead[]}, beginning with the highest priority. If a thread is found, it is
170 immediately switched to. This means that higher-priority threads
171 always take precedence over lower-priority ones. A low-priority thread
172 will only run if all higher-priority ones are suspended (e.g. through
175 \section{Thread-Switching}
176 \label{sec:threadswitching}
178 Thread-switching is the only platform-dependent part of the Cacao
179 threading system. Currently, only an Alpha implementation is
182 The thread-switch is a function (written in assembly language), which
183 does nothing more than to push all callee-saved registers to the
184 stack, switch to the stack of the new thread, pop all callee-saved
185 registers (from the switched-to stack) and return. The only problem is
186 the setting up of a new thread stack, which needs to contain values
187 for the callee-saved registers and the address of the thread-entry
188 routine (\texttt{firstStartThread()}) as the return address of the
189 function call. See the files \texttt{alpha/threads.h} and
190 \texttt{alpha/asmpart.c} for how this is managed in the Alpha
193 \section{Critical Sections}
194 \label{sec:criticalsections}
196 Although the threading system is in its current implementation not
197 preemptive, this is planned. It will be utilizing alarm timer signals
198 to preempt the running thread at certain points in time.
200 In order to write reliable code in a preemptive environment, it is
201 necessary to make it possible to prohibit preemption during certain
202 critical sections. For this purpose two macros (\texttt{intsDisable()}
203 and \texttt{intsRestore()}) are defined that can be used to enter and
204 to exit critical sections which are guaranteed not to be
205 preempted. Critical sections can be nested.
207 Apart from the code for mutexes and conditions (see
208 section~\ref{sec:synchronization}) the whole just-in-time compiler is
209 not to be preempted since it is not only not reentrant but does also
210 alter tables that may crash other threads if not in a consistent
211 state. This is one of the main problems that must be solved when
212 attempting to use operating-system threads in Cacao.
214 The other critical part outside of the threading system is the memory
215 management. Allocation on the heap is regarded as a critical section
216 as well as garbage collection which is further described in
217 section~\ref{sec:gc}.
219 \section{Thread Stacks}
220 \label{sec:threadstacks}
222 UNIX provides for each process one stack that dynamically grows when
223 its top is reached. Since there can be more than one thread running
224 in a Cacao process, there must be a way of providing stacks for the
225 other threads. Cacao does this by simply allocating a region of memory
226 and using it as the stack for a thread. This stack has, unfortunately,
227 not the capability of growing dynamically. Worse yet, since it is
228 allocated on the heap, there is not even a way of checking the stack
229 for overflow (except from checking upon each entry to a subroutine,
230 which is too much a penalty). The solution to the latter problem is
231 the allocation of additional guarding pages above the top of the stack
232 (i.e. before the stack). These pages are then (by virtue of the
233 \texttt{mprotect(2)} system call) made inaccessible to the process,
234 resulting in a \texttt{SEGV} signal when the stack has over-flown.
236 The size of the thread stacks can be altered with the \texttt{-tss}
237 command line option. The default size is 32k. The number of guarding
238 pages is by default 1 and can be changed with the
239 \texttt{-ngp} command line option (not yet implemented).
241 \section{Mutexes and Conditions}
242 \label{sec:synchronization}
244 In Java, each and every object has a mutex (used for the
245 \texttt{synchronized} directive) and a condition variable (used by the
246 \texttt{wait()}, \texttt{notify()} and
247 \texttt{notifyAll()} methods) attached. Since we did not want to include
248 these two data structures in each object, as they take up rather a lot
249 of space on a 64-bit architecture, we decided to create them on demand
250 and keep them in a hash table that is indexed by the address of the
251 object the mutex/condition belongs to. This way, we are able to
252 allocate these structures on-demand: an object's mutex is allocated
253 not upon object creation but when the mutex is locked. When the mutex
254 is unlocked, it is freed again. For detailed information on this
255 topic, see the paper `Monitors and Exceptions: How to efficiently
258 Following are the structures of mutexes and conditions:
261 typedef struct _iMux {
262 struct _thread* holder;
264 struct _thread* muxWaiters;
270 \item[holder] Points to the thread that has currently locked the mutex. If the mutex is not
271 locked, \texttt{holder} is \texttt{0}.
273 \item[count] How often the mutex is locked by the current thread. It is incremented each time
274 the mutex is locked and decremented when the mutex is
275 unlocked. \texttt{0} if the mutex is unlocked.
277 \item[muxWaiters] List of threads that are waiting on the mutex to become unlocked.
282 typedef struct _iCv {
283 struct _thread* cvWaiters;
290 \item[cvWaiters] List of threads waiting on the condition to be signalled.
292 \item[mux] Mutex which was used to wait on the condition.
299 UNIX I/O is by default blocking, which is suboptimal for user-level
300 threads, since a blocking I/O call would not only block its own thread
301 but the whole process. For this purpose Cacao makes all file
302 descriptors use non-blocking I/O and, whenever a thread wants to make
303 an I/O call that would block, suspends this thread and does not resume
304 it until the call would not require blocking. In order to check that
305 condition, Cacao does a \texttt{select()} with all pending file
306 descriptors (in function \texttt{checkEvents()}) whenever a thread
307 switch is requested and resumes all threads that can continue.
309 \section{Garbage Collection}
312 Garbage collection requires special attention in a threaded
313 environment and more so in Cacao. Cacao uses a recursive
314 mark-and-sweep garbage collector, which means that it is not reentrant
315 and may use up a lot of stack. This means not only that it has to run
316 without been preempted but also that it needs to run in a thread with
317 a lot of stack available, which is not the case for a standard
318 thread. The best solution is obviously to run it on the stack of the
319 main thread which grows automatically. This is exactly what Cacao
320 does. Since the stack tops of all threads are known at any time, it is
321 a trivial task to switch over to another stack. This is handled by the
322 system dependent function \texttt{asm\_switchstackandcall} (defined in
323 \texttt{sysdep/asmpart.c}).
325 Another problem that needs to be taken care of is that all objects
326 that are referenced by any live thread need to be marked. Furthermore,
327 \texttt{Thread} objects that correspond to live threads must also be
328 marked, even if they are not referenced by any thread (which is
329 possible). To serve both purposes the list \texttt{liveThreads}
330 contains all live threads.
332 \section{Global Variables}
333 \label{sec:globalvars}
337 \item[int blockInts] Indicates whether execution is within a critical section. \texttt{0} if
338 not, otherwise \texttt{>0}, depending on the nesting level.
340 \item[bool needReschedule] Is \texttt{true} if a thread switch should occur upon exit of
341 the currently executed critical section, otherwise \texttt{false}.
343 \item[thread *currentThread] The currently executed thread.
345 \item[ctx contexts[\texttt{]}] Array containing \texttt{MAXTHREADS} thread contexts.
347 \texttt{free==false} are not occupied. The fact that this array has a fixed size means
348 that there is maximum number of threads. This may change in the future.
350 \item[thread *liveThreads] List of all live threads. The list is linked through the
351 \texttt{nextlive} member of the corresponding contexts of the threads.
353 \item[thread *threadQhead[\texttt{]}] Array of lists of not-suspended live threads. Indexed
362 \item Cross-check threading code with newest release of Kaffe.
364 \item Preemptive multi-threading through alarm-timers.
366 \item Implement thread-table more efficiently (maintain free-list).
368 \item Implement some missing command-line switches (\texttt{getopt} would be cool).