A \textit{Java Virtual Machine} (JVM) dynamically loads, links and
initializes classes and interfaces when they are needed. Loading a
class or interface means locating the binary representation---the
class files---and creating a class or interface structure from that
binary representation. Linking takes a loaded class or interface and
transfers it into the runtime state of the \textit{Java Virtual
Machine} so that it can be executed. Initialization of a class or
interface means executing the static class or interface initializer.

The following sections describe the process of loading, linking and
initializing a class or interface in the CACAO \textit{Java Virtual
Machine} in greater detail. They also describe the data structures and
techniques used in CACAO and the interaction with the GNU classpath.

\section{System class loader}

The class loader of a \textit{Java Virtual Machine} (JVM) is
responsible for loading all types of classes and interfaces into the
runtime system of the JVM. Every JVM has a \textit{system class
loader}, which is implemented in \texttt{java.lang.ClassLoader}; this
class interacts with the JVM itself via native function calls.

The \textit{GNU classpath} implements the system class loader in
\texttt{gnu.java.lang.SystemClassLoader}, which extends
\texttt{java.lang.ClassLoader} and interacts with the JVM. The
\textit{bootstrap class loader} is implemented in
\texttt{java.lang.ClassLoader} plus the JVM-dependent class
\texttt{java.lang.VMClassLoader}. \texttt{java.lang.VMClassLoader} is
the main class through which the bootstrap class loader of the GNU
classpath interacts with the JVM. The main function of this class is

\begin{verbatim}
    static final native Class loadClass(String name, boolean resolve)
        throws ClassNotFoundException;
\end{verbatim}

This is a native function implemented in the CACAO JVM. It is
located in \texttt{nat/VMClassLoader.c} and calls the internal loader
functions of CACAO. If the \texttt{name} argument is \texttt{NULL}, a
new \texttt{java.lang.NullPointerException} is created and the
function returns \texttt{NULL}.

If \texttt{name} is non-\texttt{NULL}, a new UTF8 string of the class
name is created in the internal \textit{symbol table} via

\begin{verbatim}
    utf *javastring_toutf(java_lang_String *string, bool isclassname);
\end{verbatim}

This function converts a \texttt{java.lang.String} string into the
internally used UTF8 string representation. \texttt{isclassname} tells
the function to convert any \texttt{.} (periods) found in the class
name into \texttt{/} (slashes), so the class loader can find the
class.

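
The period-to-slash conversion can be illustrated in isolation. The
following is only a sketch on plain C strings: the helper
\texttt{dotted\_to\_internal\_name} is a hypothetical name, and the
real \texttt{javastring\_toutf} works on \texttt{java.lang.String}
objects and the symbol table, which are not modelled here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of CACAO): returns a newly allocated
   copy of a dotted class name with every '.' replaced by '/'.
   The caller frees the result. */
char *dotted_to_internal_name(const char *dotted)
{
    assert(dotted != NULL);

    size_t len = strlen(dotted);
    char *internal = malloc(len + 1);
    if (internal == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++)
        internal[i] = (dotted[i] == '.') ? '/' : dotted[i];
    internal[len] = '\0';

    return internal;
}
```

With this helper, \texttt{java.lang.String} becomes
\texttt{java/lang/String}, which matches the directory layout of class
files on disk and inside archives.
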
Then a new \texttt{classinfo} structure is created via the

\begin{verbatim}
    classinfo *class_new(utf *classname);
\end{verbatim}

function call. This function creates a unique representation of this
class, identified by its name, in the JVM's internal \textit{class
hashtable}. The newly created \texttt{classinfo} structure is
initialized with defined values, such as \texttt{loaded = false;},
\texttt{linked = false;} and \texttt{initialized = false;}. This
guarantees a definite state for a new class.

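
The idea of a unique, name-keyed entry with a definite initial state
can be sketched with a small chained hash table. Everything in this
example (\texttt{class\_entry}, \texttt{class\_entry\_get}, the bucket
count and the hash function) is an assumption made for illustration
and does not reproduce CACAO's actual class hashtable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 64               /* illustrative bucket count */

typedef struct class_entry {        /* hypothetical stand-in for classinfo */
    struct class_entry *next;       /* next entry in the same bucket */
    char *name;
    bool loaded, linked, initialized;
} class_entry;

static class_entry *class_table[TABLE_SIZE];

static unsigned hash_name(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char) *s++;
    return h % TABLE_SIZE;
}

/* Returns the unique entry for name, creating it in a definite
   "nothing done yet" state on first use. */
class_entry *class_entry_get(const char *name)
{
    assert(name != NULL);
    unsigned slot = hash_name(name);

    for (class_entry *e = class_table[slot]; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return e;               /* already known: same pointer again */

    class_entry *e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->name = malloc(strlen(name) + 1);
    if (e->name == NULL) {
        free(e);
        return NULL;
    }
    strcpy(e->name, name);
    e->loaded = e->linked = e->initialized = false;
    e->next = class_table[slot];
    class_table[slot] = e;
    return e;
}
```

Because a lookup by the same name always returns the same pointer,
class identity can be tested by comparing pointers.
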
The next step is to actually load the requested class. For this, the
function

\begin{verbatim}
    classinfo *class_load(classinfo *c);
\end{verbatim}

is called, which is a wrapper around the real loader function

\begin{verbatim}
    classinfo *class_load_intern(classbuffer *cb);
\end{verbatim}

The wrapper function is needed to take care of the following:

\begin{itemize}
 \item enter a monitor on the \texttt{classinfo} structure, so that
 only one thread can load a given class at a time

 \item initialize the \texttt{classbuffer} structure with the actual
 class file data

 \item remove the \texttt{classinfo} structure from the internal table
 if an exception occurred during loading

 \item free any allocated memory and leave the monitor
\end{itemize}

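
The control flow of such a wrapper can be sketched as follows. This is
a single-threaded illustration under stated assumptions: the monitor
is replaced by a simple depth counter, the class table by a flag, and
all names (\texttt{sketch\_class}, \texttt{sketch\_buffer},
\texttt{load\_intern}, \texttt{load\_wrapper}) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int  monitor_depth;   /* stand-in for a real per-class monitor */
    bool in_table;        /* still registered in the class table?  */
    bool loaded;
} sketch_class;

typedef struct {
    sketch_class *cls;
    const unsigned char *data;
    size_t size;
} sketch_buffer;

/* Stand-in for the real loader: succeeds if there is any data at all. */
static bool load_intern(sketch_buffer *cb)
{
    return cb->data != NULL && cb->size > 0;
}

/* Wrapper: lock, set up the buffer, load, undo on failure, unlock. */
sketch_class *load_wrapper(sketch_class *c, const unsigned char *bytes,
                           size_t n)
{
    assert(c != NULL);
    c->monitor_depth++;                  /* enter the monitor */

    sketch_buffer cb = { c, bytes, n };  /* initialize the buffer */
    bool ok = c->loaded || load_intern(&cb);

    if (ok)
        c->loaded = true;
    else
        c->in_table = false;             /* drop the failed class */

    c->monitor_depth--;                  /* leave the monitor on every path */
    return ok ? c : NULL;
}
```

The point of the sketch is that the monitor is left on the success
path and on the failure path alike, and that a failed class does not
stay registered.
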
The \texttt{class\_load\_intern} function performs the actual loading
of the binary representation of the class or interface. During loading
some verifier checks are performed which can throw an error. This
error can be a \texttt{java.lang.ClassFormatError} or a
\texttt{java.lang.NoClassDefFoundError}. Some of these
\texttt{java.lang.ClassFormatError} checks are:

\begin{itemize}
 \item \textit{Truncated class file} --- unexpected end of class file

 \item \textit{Bad magic number} --- class file does not start with
 the magic bytes (\texttt{0xCAFEBABE})

 \item \textit{Unsupported major.minor version} --- the bytecode
 version of the given class file is not supported by the JVM
\end{itemize}

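
The three checks listed above can be sketched on a raw byte array. In
the class file format the magic number is followed by the minor and
then the major version, each a big-endian \texttt{u2}. The function
name, the status codes and the value of
\texttt{MAX\_SUPPORTED\_MAJOR} are assumptions for illustration, not
CACAO's actual limits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    HEADER_OK,
    HEADER_TRUNCATED,
    HEADER_BAD_MAGIC,
    HEADER_UNSUPPORTED_VERSION
} header_status;

#define MAX_SUPPORTED_MAJOR 48   /* illustrative limit only */

/* Hypothetical check of the first eight bytes of a class file. */
header_status check_class_header(const uint8_t *data, size_t size)
{
    assert(data != NULL || size == 0);

    if (size < 8)
        return HEADER_TRUNCATED;

    uint32_t magic = ((uint32_t) data[0] << 24) | ((uint32_t) data[1] << 16)
                   | ((uint32_t) data[2] << 8)  |  (uint32_t) data[3];
    if (magic != 0xCAFEBABEu)
        return HEADER_BAD_MAGIC;

    /* bytes 4-5 hold minor_version, bytes 6-7 hold major_version */
    unsigned major = ((unsigned) data[6] << 8) | data[7];
    if (major > MAX_SUPPORTED_MAJOR)
        return HEADER_UNSUPPORTED_VERSION;

    return HEADER_OK;
}
```

In a real loader each non-OK status would be turned into a
\texttt{java.lang.ClassFormatError} with the matching detail message.
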
The actual loading of the bytes from the binary representation is done
via the \texttt{suck\_*} functions. These functions are:

\begin{itemize}
 \item \texttt{suck\_u1}: load one \texttt{unsigned byte} (8 bit)

 \item \texttt{suck\_u2}: load two \texttt{unsigned byte}s (16 bit)

 \item \texttt{suck\_u4}: load four \texttt{unsigned byte}s (32 bit)

 \item \texttt{suck\_u8}: load eight \texttt{unsigned byte}s (64 bit)

 \item \texttt{suck\_float}: load four \texttt{byte}s (32 bit)
 converted into a \texttt{float} value

 \item \texttt{suck\_double}: load eight \texttt{byte}s (64 bit)
 converted into a \texttt{double} value

 \item \texttt{suck\_nbytes}: load \textit{n} bytes
\end{itemize}

Loading \texttt{signed} values is done via the
\texttt{suck\_s[1,2,4,8]} macros, which cast the loaded bytes to
\texttt{signed} values. All these functions take a
\texttt{classbuffer}~(Figure \ref{classbuffer}) structure pointer as
argument.

\begin{figure}[h]
\begin{verbatim}
    typedef struct classbuffer {
        classinfo *class;       /* pointer to classinfo structure */
        u1        *data;        /* pointer to byte code           */
        s4         size;        /* size of the byte code          */
        u1        *pos;         /* current read position          */
    } classbuffer;
\end{verbatim}
\caption{\texttt{classbuffer} structure}
\label{classbuffer}
\end{figure}

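
How readers of this kind can be built on a buffer with a moving read
position is sketched below. Class files store multi-byte values in
big-endian order. The \texttt{byte\_reader} type and the
\texttt{read\_*} names are hypothetical stand-ins, the sketch assumes
a 32-bit IEEE 754 \texttt{float}, and it omits bounds checks, which
are a separate step.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {              /* hypothetical stand-in for classbuffer */
    const uint8_t *data;      /* start of the class file bytes */
    int32_t size;             /* number of bytes in data       */
    const uint8_t *pos;       /* current read position         */
} byte_reader;

uint8_t read_u1(byte_reader *r)
{
    assert(r != NULL && r->pos != NULL);
    return *r->pos++;
}

uint16_t read_u2(byte_reader *r)
{
    uint16_t hi = read_u1(r);             /* most significant byte first */
    return (uint16_t) ((hi << 8) | read_u1(r));
}

uint32_t read_u4(byte_reader *r)
{
    uint32_t hi = read_u2(r);
    return (hi << 16) | read_u2(r);
}

/* Signed read: reinterpret the unsigned bits as a signed value. */
int16_t read_s2(byte_reader *r)
{
    return (int16_t) read_u2(r);
}

/* Float read: copy the 32-bit pattern into a float. */
float read_float(byte_reader *r)
{
    uint32_t bits = read_u4(r);
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Building the wider readers from the narrower ones keeps the byte order
handling in one place.
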
This \texttt{classbuffer} structure is filled with data via the

\begin{verbatim}
    classbuffer *suck_start(classinfo *c);
\end{verbatim}

function. This function tries to locate the class, specified by the
\texttt{classinfo} structure, in the \texttt{CLASSPATH}. This can be
a plain class file in the filesystem or a file in a
\texttt{zip}/\texttt{jar} file. If the class file is found, the
\texttt{classbuffer} is filled with data collected from the class
file, including the class file size and the binary representation of
the class.

Before reading any byte of the binary representation with a
\texttt{suck\_*} function, the remaining bytes in the
\texttt{classbuffer} data array must be checked with the

\begin{verbatim}
    static inline bool check_classbuffer_size(classbuffer *cb, s4 len);
\end{verbatim}

function. If the number of remaining bytes is less than the number of
bytes to be read, specified by the \texttt{len} argument, a
\texttt{java.lang.ClassFormatError} with the detail message
\textit{Truncated class file}, as mentioned before, is thrown.

\subsection{Constant pool loading}

The class' constant pool is loaded via

\begin{verbatim}
    static bool class_loadcpool(classbuffer *cb, classinfo *c);
\end{verbatim}

from the \texttt{constant\_pool} table in the binary representation of
the class or interface. The constant pool needs to be parsed in two
passes. In the first pass the information loaded is saved in temporary
structures, which are further processed in the second pass, when the
complete constant pool has been traversed. A constant pool entry can
only be completely resolved once all constant pool entries have been
loaded, and this resolving can only be done in a specific order:

\begin{enumerate}
 \item \texttt{CONSTANT\_Class}

 \item \texttt{CONSTANT\_String}

 \item \texttt{CONSTANT\_NameAndType}

 \item \texttt{CONSTANT\_Fieldref}, \texttt{CONSTANT\_Methodref} and
 \texttt{CONSTANT\_InterfaceMethodref} --- these are combined into one
 structure
\end{enumerate}

The remaining constant pool types \texttt{CONSTANT\_Integer},
\texttt{CONSTANT\_Float}, \texttt{CONSTANT\_Long},
\texttt{CONSTANT\_Double} and \texttt{CONSTANT\_Utf8} can be
completely resolved in the first pass and need no further processing.

These are the temporary structures used to \textit{forward} the data
from the first pass into the second:

\begin{verbatim}
    /* CONSTANT_Class entries */
    typedef struct forward_class {
        struct forward_class *next;

    /* CONSTANT_String */
    typedef struct forward_string {
        struct forward_string *next;

    /* CONSTANT_NameAndType */
    typedef struct forward_nameandtype {
        struct forward_nameandtype *next;
    } forward_nameandtype;

    /* CONSTANT_Fieldref, CONSTANT_Methodref or CONSTANT_InterfaceMethodref */
    typedef struct forward_fieldmethint {
        struct forward_fieldmethint *next;
        u2 nameandtype_index;
    } forward_fieldmethint;
\end{verbatim}

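
The reason for the two passes is that an entry may reference another
entry that appears later in the pool. The sketch below shows the
technique for class entries only, on an already decoded array instead
of raw bytes. The types \texttt{raw\_entry} and \texttt{forward\_ref},
the function \texttt{resolve\_pool} and the limit \texttt{MAX\_POOL}
are assumptions for illustration. The tag values 1 and 7 are those of
\texttt{CONSTANT\_Utf8} and \texttt{CONSTANT\_Class} in the class file
format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TAG_UTF8 = 1, TAG_CLASS = 7 };

typedef struct {               /* one decoded entry (hypothetical) */
    int tag;
    const char *utf8;          /* used when tag == TAG_UTF8  */
    int name_index;            /* used when tag == TAG_CLASS */
} raw_entry;

typedef struct forward_ref {   /* remembered for the second pass */
    struct forward_ref *next;
    int this_index;
    int name_index;
} forward_ref;

#define MAX_POOL 16            /* illustrative limit only */

/* Fills names[1..count-1]; a class entry gets the name it refers to.
   Returns 0 on success and -1 on a malformed reference. */
int resolve_pool(const raw_entry *raw, int count, const char **names)
{
    forward_ref storage[MAX_POOL];
    forward_ref *pending = NULL;
    int used = 0;

    assert(raw != NULL && names != NULL && count <= MAX_POOL);

    /* Pass 1: UTF8 entries are final, class entries are queued. */
    for (int i = 1; i < count; i++) {
        names[i] = NULL;
        if (raw[i].tag == TAG_UTF8) {
            names[i] = raw[i].utf8;
        } else if (raw[i].tag == TAG_CLASS) {
            forward_ref *f = &storage[used++];
            f->this_index = i;
            f->name_index = raw[i].name_index;
            f->next = pending;
            pending = f;
        }
    }

    /* Pass 2: every UTF8 entry is now available, whatever its position. */
    for (forward_ref *f = pending; f != NULL; f = f->next) {
        int n = f->name_index;
        if (n < 1 || n >= count || raw[n].tag != TAG_UTF8)
            return -1;
        names[f->this_index] = raw[n].utf8;
    }
    return 0;
}
```

Slot 0 is skipped because constant pool indexes start at 1.
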
The \texttt{classinfo} structure has two pointers to arrays which
contain the class' constant pool information, namely \texttt{cptags}
and \texttt{cpinfos}. \texttt{cptags} contains the type of each
constant pool entry. \texttt{cpinfos} contains a pointer to the
constant pool entry itself. In the second pass the references are
resolved and the runtime structures are created. In detail, this
includes the following for each entry type:

\begin{itemize}
 \item \texttt{CONSTANT\_Class}: get the UTF8 name string of the
 class, store type \texttt{CONSTANT\_Class} in \texttt{cptags}, create
 a class in the class hashtable with the UTF8 name and store the
 pointer to the new class in \texttt{cpinfos}

 \item \texttt{CONSTANT\_String}: get the UTF8 string of the
 referenced string, store type \texttt{CONSTANT\_String} in
 \texttt{cptags} and store the UTF8 string pointer into
 \texttt{cpinfos}

 \item \texttt{CONSTANT\_NameAndType}: create a
 \texttt{constant\_nameandtype}~(Figure \ref{constantnameandtype})
 structure, get the UTF8 name and descriptor string of the field or
 method and store them into the \texttt{constant\_nameandtype}
 structure, store type \texttt{CONSTANT\_NameAndType} into
 \texttt{cptags} and store a pointer to the
 \texttt{constant\_nameandtype} structure into \texttt{cpinfos}

\begin{figure}[h]
\begin{verbatim}
    typedef struct {        /* NameAndType (Field or Method)       */
        utf *name;          /* field/method name                   */
        utf *descriptor;    /* field/method type descriptor string */
    } constant_nameandtype;
\end{verbatim}
\caption{\texttt{constant\_nameandtype} structure}
\label{constantnameandtype}
\end{figure}

 \item \texttt{CONSTANT\_Fieldref}, \texttt{CONSTANT\_Methodref} and
 \texttt{CONSTANT\_InterfaceMethodref}: create a
 \texttt{constant\_FMIref}~(Figure \ref{constantFMIref}) structure,
 get the referenced \texttt{constant\_nameandtype} structure, which
 contains the name and descriptor resolved in a previous step, and
 store the name and descriptor into the \texttt{constant\_FMIref}
 structure, get the pointer to the referenced class, which was created
 in a previous step, and store it into the \texttt{constant\_FMIref}
 structure, store the type of the current constant pool entry in
 \texttt{cptags} and store a pointer to the \texttt{constant\_FMIref}
 structure in \texttt{cpinfos}

\begin{figure}[h]
\begin{verbatim}
    typedef struct {        /* Fieldref, Methodref and InterfaceMethodref    */
        classinfo *class;   /* class containing this field/method/interface  */
        utf       *name;    /* field/method/interface name                   */
        utf       *descriptor; /* field/method/interface type descriptor string */
    } constant_FMIref;
\end{verbatim}
\caption{\texttt{constant\_FMIref} structure}
\label{constantFMIref}
\end{figure}
\end{itemize}

Any UTF8 strings, \texttt{constant\_nameandtype} structures or
referenced classes are resolved with the

\begin{verbatim}
    voidptr class_getconstant(classinfo *c, u4 pos, u4 ctype);
\end{verbatim}

function. This function checks for type equality and then returns the
requested \texttt{cpinfos} slot of the specified class.

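
A type-checked slot lookup of this kind can be sketched as follows.
The \texttt{pool\_view} type and the function \texttt{pool\_get} are
hypothetical. The sketch returns \texttt{NULL} on a mismatch, whereas
a real VM would raise an error, and the range check on the index is
an addition of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {              /* hypothetical view of cptags/cpinfos */
    uint32_t count;           /* number of constant pool slots  */
    const uint8_t *tags;      /* type tag of each slot          */
    void **infos;             /* resolved entry of each slot    */
} pool_view;

/* Returns the entry in slot pos only if pos is in range and the stored
   tag equals the expected one; otherwise returns NULL. */
void *pool_get(const pool_view *p, uint32_t pos, uint32_t expected_tag)
{
    assert(p != NULL);

    if (pos >= p->count || p->tags[pos] != expected_tag)
        return NULL;
    return p->infos[pos];
}
```

Checking the tag before handing out the pointer keeps a malformed
index from being interpreted as the wrong kind of structure.
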
\subsection{Interface resolving}

The interface classes are resolved with \texttt{class\_getconstant}
from the class' constant pool. After reading the number of interfaces,
a \texttt{u2} index is read from the class or interface file being
loaded for every referenced interface. This index is used to resolve
the interface class from the constant pool.

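
Reading the interface table amounts to reading a count followed by
that many indexes. The sketch below works on a plain byte array, with
a hypothetical function name and error convention. Resolving each
index against the constant pool is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads a big-endian u2 count followed by that many u2 constant pool
   indexes from bytes[0..size). Returns the number of indexes stored in
   out, or -1 if the data is truncated or out is too small. */
int read_interface_indexes(const uint8_t *bytes, size_t size,
                           uint16_t *out, size_t out_cap)
{
    assert(bytes != NULL && out != NULL);

    if (size < 2)
        return -1;

    size_t count = ((size_t) bytes[0] << 8) | bytes[1];
    if (count > out_cap || size < 2 + 2 * count)
        return -1;

    for (size_t i = 0; i < count; i++) {
        const uint8_t *p = bytes + 2 + 2 * i;
        out[i] = (uint16_t) ((p[0] << 8) | p[1]);
    }
    return (int) count;
}
```

Each returned index would then be passed to the constant pool lookup
with the expected type \texttt{CONSTANT\_Class}.
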
\subsection{Field loading}

The number of fields of the class or interface is read as a
\texttt{u2} value. For each field the function

\begin{verbatim}
    static bool field_load(classbuffer *cb, classinfo *c, fieldinfo *f);
\end{verbatim}

is called. The \texttt{fieldinfo *} argument is a pointer to a
\texttt{fieldinfo} structure allocated by the class loader. The
field's \texttt{name} and \texttt{descriptor} are resolved from the
class constant pool via \texttt{class\_getconstant}. If the verifier
option is turned on, the field's \texttt{flags}, \texttt{name} and
\texttt{descriptor} are checked for validity; a failed check results
in a \texttt{java.lang.ClassFormatError}.

Each field can have some attributes. The number of attributes is read
as a \texttt{u2} value from the binary representation. If the field
has the \texttt{ACC\_FINAL} flag set, a \texttt{ConstantValue}
attribute may be present. This is the only attribute processed by
\texttt{field\_load}, and it can occur only once; otherwise a
\texttt{java.lang.ClassFormatError} is thrown. The
\texttt{ConstantValue} entry in the constant pool contains the value
for the \texttt{final} field. Depending on the field's type, the
proper constant pool entry is resolved and assigned.

\subsection{Method loading}

As for the fields, the number of class or interface methods is read
from the binary representation as a \texttt{u2} value. For each method
the function

\begin{verbatim}
    static bool method_load(classbuffer *cb, classinfo *c, methodinfo *m);
\end{verbatim}

is called. The beginning of the method loading code is nearly the same
as the field loading code. The \texttt{methodinfo *} argument is a
pointer to a \texttt{methodinfo} structure allocated by the class
loader. The method's \texttt{name} and \texttt{descriptor} are
resolved from the class constant pool via
\texttt{class\_getconstant}. With the verifier turned on, some method
checks are carried out. These include \texttt{flags}, \texttt{name}
and \texttt{descriptor} checks and an argument count check.

Now the method loading function has to distinguish between a
\texttt{native} and a normal Java method. Depending on the
\texttt{ACC\_NATIVE} flag, a different stub is created.

For a normal Java method, a \textit{compiler stub} is created. The
purpose of this stub is to call the CACAO JIT compiler to compile the
Java method. During code generation, a pointer to this compiler stub
routine is used as the call target if the method is not compiled
yet. After the target method is compiled, the new entry point of the
method is patched into the generated code. The compiler stub is then
no longer needed and is freed.

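
The idea of a stub that compiles on first call and then gets out of
the way can be sketched with function pointers. This is a strongly
simplified model under stated assumptions: the "compiled code" is an
ordinary C function, the entry point is patched in a hypothetical
\texttt{method} structure instead of in generated machine code, and
nothing is freed.

```c
#include <assert.h>
#include <stddef.h>

typedef struct method method;
typedef int (*entry_fn)(method *m, int arg);

struct method {               /* hypothetical stand-in for methodinfo */
    entry_fn entry;           /* where calls to this method currently go */
    int compile_count;        /* how often the "compiler" ran */
};

/* Stand-in for the machine code a JIT compiler would produce. */
static int compiled_body(method *m, int arg)
{
    (void) m;
    return arg * 2;
}

/* Stand-in for the compiler stub: compile, patch the entry point, then
   continue in the freshly compiled code. */
static int compiler_stub(method *m, int arg)
{
    m->compile_count++;
    m->entry = compiled_body;
    return m->entry(m, arg);
}

/* A freshly loaded method starts out pointing at the stub. */
void method_init(method *m)
{
    assert(m != NULL);
    m->entry = compiler_stub;
    m->compile_count = 0;
}
```

The first call through \texttt{entry} triggers the compilation, and
every later call reaches the compiled body directly, so the stub runs
at most once per method.
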
If the method is a \texttt{native} method, the loader tries to find
the native function. If the function is found, a \textit{native
stub} is generated. This stub is responsible for adapting the
method's arguments so that they are suitable for the \texttt{native}
method called. This includes inserting the \textit{JNI environment}
pointer as first argument and, if the \texttt{native} method has the
\texttt{ACC\_STATIC} flag set, inserting a pointer to the method's
class as second argument. If the \texttt{native} method is
\texttt{static}, the native stub also checks if the method's class is
already initialized. If the method's class is not initialized when the
native stub is generated, code that calls
\texttt{asm\_check\_clinit} is added to the stub.

Each method can have some attributes.

\section{Data structures}

\section{Dynamic class loader}

\section{Eager - lazy class loading}

\section{Initialization}