A \textit{Java Virtual Machine} (JVM) dynamically loads, links and
initializes classes and interfaces when they are needed. Loading a
class or interface means locating the binary representation---the
class files---and creating a class or interface structure from that
binary representation. Linking takes a loaded class or interface and
transfers it into the runtime state of the \textit{Java Virtual
Machine} so that it can be executed. Initialization of a class or
interface means executing the static class or interface initializer.

The following sections describe the process of loading, linking and
initializing a class or interface in the CACAO \textit{Java Virtual
Machine} in greater detail. Furthermore, the data structures and
techniques used in CACAO and the interaction with the GNU classpath
are described.
\section{System class loader}
\label{sectionsystemclassloader}

The class loader of a \textit{Java Virtual Machine} (JVM) is
responsible for loading all types of classes and interfaces into the
runtime system of the JVM. Every JVM has a \textit{system class
loader} which is implemented in \texttt{java.lang.ClassLoader}, and
this class interacts with the JVM itself via native function calls.

The \textit{GNU classpath} implements the system class loader in
\texttt{gnu.java.lang.SystemClassLoader}, which extends
\texttt{java.lang.ClassLoader} and interacts with the JVM. The
\textit{bootstrap class loader} is implemented in
\texttt{java.lang.ClassLoader} plus the JVM-dependent class
\texttt{java.lang.VMClassLoader}. \texttt{java.lang.VMClassLoader} is
the main class through which the bootstrap class loader of the GNU
classpath interacts with the JVM. The main function of this class is

\begin{verbatim}
    static final native Class loadClass(String name, boolean resolve)
        throws ClassNotFoundException;
\end{verbatim}
This is a native function implemented in the CACAO JVM, which is
located in \texttt{nat/VMClassLoader.c} and calls the internal loader
functions of CACAO. If the \texttt{name} argument is \texttt{NULL}, a
new \texttt{java.lang.NullPointerException} is created and the
function returns \texttt{NULL}.

If the \texttt{name} is non-\texttt{NULL}, a new UTF8 string of the
class' name is created in the internal \textit{symbol table} via

\begin{verbatim}
    utf *javastring_toutf(java_lang_String *string, bool isclassname);
\end{verbatim}

This function converts a \texttt{java.lang.String} string into the
internally used UTF8 string representation. \texttt{isclassname} tells
the function to convert any \texttt{.} (periods) found in the class
name into \texttt{/} (slashes), so that the class loader can find the
class file.

Then a new \texttt{classinfo} structure is created via the

\begin{verbatim}
    classinfo *class_new(utf *classname);
\end{verbatim}

function call. This function creates a unique representation of this
class, identified by its name, in the JVM's internal \textit{class
hashtable}. The newly created \texttt{classinfo} structure (see
figure~\ref{classinfostructure}) is initialized with correct values,
like \texttt{loaded = false;}, \texttt{linked = false;} and
\texttt{initialized = false;}. This guarantees that a newly created
\texttt{classinfo} structure is in a well-defined state.
\begin{figure}
\begin{verbatim}
    struct classinfo {                /* class structure                       */

        s4          flags;            /* ACC flags                             */
        utf        *name;             /* class name                            */

        s4          cpcount;          /* number of entries in constant pool    */
        u1         *cptags;           /* constant pool tags                    */
        voidptr    *cpinfos;          /* pointer to constant pool info structures */

        classinfo  *super;            /* super class pointer                   */
        classinfo  *sub;              /* sub class pointer                     */
        classinfo  *nextsub;          /* pointer to next class in sub class list */

        s4          interfacescount;  /* number of interfaces                  */
        classinfo **interfaces;       /* pointer to interfaces                 */

        s4          fieldscount;      /* number of fields                      */
        fieldinfo  *fields;           /* field table                           */

        s4          methodscount;     /* number of methods                     */
        methodinfo *methods;          /* method table                          */

        bool        initialized;      /* true, if class already initialized    */
        bool        initializing;     /* flag for the compiler                 */
        bool        loaded;           /* true, if class already loaded         */
        bool        linked;           /* true, if class already linked         */
        s4          index;            /* hierarchy depth (classes) or index    */

        s4          instancesize;     /* size of an instance of this class     */
    #ifdef SIZE_FROM_CLASSINFO
        s4          alignedsize;      /* size of an instance, aligned to the   */
                                      /* allocation size on the heap           */
    #endif

        vftbl_t    *vftbl;            /* pointer to virtual function table     */

        methodinfo *finalizer;        /* finalizer method                      */

        u2          innerclasscount;  /* number of inner classes               */
        innerclassinfo *innerclass;

        utf        *packagename;      /* full name of the package              */
        utf        *sourcefile;       /* classfile name containing this class  */
        java_objectheader *classloader; /* NULL for bootstrap classloader      */
    };
\end{verbatim}
\caption{\texttt{classinfo} structure}
\label{classinfostructure}
\end{figure}
The next step is to actually load the requested class. Thus the main
loader function

\begin{verbatim}
    classinfo *class_load(classinfo *c);
\end{verbatim}

is called, which is a wrapper function around the real loader function

\begin{verbatim}
    classinfo *class_load_intern(classbuffer *cb);
\end{verbatim}

The wrapper function is needed to take care of the following tasks:

\begin{itemize}
 \item enter a monitor on the \texttt{classinfo} structure, so that
 only one thread can load the same class or interface at the same time

 \item check if the class or interface is \texttt{loaded}; if it is
 \texttt{true}, leave the monitor and return immediately

 \item measure the loading time if requested

 \item initialize the \texttt{classbuffer} structure with the actual
 class file data

 \item reset the \texttt{loaded} field of the \texttt{classinfo}
 structure to \texttt{false} and remove the \texttt{classinfo}
 structure from the internal class hashtable if an error or
 exception occurred during loading

 \item free any allocated memory

 \item leave the monitor
\end{itemize}
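
The control flow of such a wrapper can be sketched as follows. This is only an illustration under stated assumptions, not CACAO's code: \texttt{demo\_class}, \texttt{demo\_class\_load}, \texttt{demo\_load\_intern} and the monitor helpers are hypothetical names, and the monitor is reduced to a nesting counter, whereas a real VM would use its thread library here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the classinfo structure. */
typedef struct demo_class {
    int  monitor_depth;  /* placeholder monitor: nesting depth only */
    bool loaded;         /* true once loading has succeeded         */
    int  load_calls;     /* how often the internal loader has run   */
} demo_class;

/* Placeholder monitor operations (no real locking is performed). */
static void demo_monitor_enter(demo_class *c) { c->monitor_depth++; }

static void demo_monitor_exit(demo_class *c)
{
    assert(c->monitor_depth > 0);
    c->monitor_depth--;
}

/* Stand-in for the real loader: reports failure when asked to. */
static bool demo_load_intern(demo_class *c, bool simulate_error)
{
    c->load_calls++;
    return !simulate_error;
}

/* Wrapper: enter the monitor, return early if already loaded, run the
   real loader, leave `loaded` false on error, leave the monitor. */
static demo_class *demo_class_load(demo_class *c, bool simulate_error)
{
    demo_class *result = c;

    assert(c != NULL);
    demo_monitor_enter(c);

    if (!c->loaded) {
        c->loaded = demo_load_intern(c, simulate_error);
        if (!c->loaded)
            result = NULL;   /* signal the error to the caller */
    }

    demo_monitor_exit(c);
    return result;
}
```

Calling \texttt{demo\_class\_load} twice on a class that loaded successfully runs the internal loader only once, because the second call takes the early return path.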
The \texttt{class\_load} function is implemented to be
\textit{reentrant}. This must be the case for the \textit{eager class
loading} algorithm implemented in CACAO (described in more detail in
section \ref{sectioneagerclassloading}). Furthermore this means that
several threads can load different classes or interfaces at the same
time on multiprocessor machines.

The \texttt{class\_load\_intern} function performs the actual loading
of the binary representation of the class or interface. During loading
some verifier checks are performed which can throw an error. This
error can be a \texttt{java.lang.ClassFormatError} or a
\texttt{java.lang.NoClassDefFoundError}. Some of these
\texttt{java.lang.ClassFormatError} checks are

\begin{itemize}
 \item \textit{Truncated class file} --- unexpected end of class file

 \item \textit{Bad magic number} --- class file does not start with
 the magic bytes (\texttt{0xCAFEBABE})

 \item \textit{Unsupported major.minor version} --- the bytecode
 version of the given class file is not supported by the JVM
\end{itemize}
The actual loading of the bytes from the binary representation is done
via the \texttt{suck\_*} functions. These functions are

\begin{itemize}
 \item \texttt{suck\_u1}: load one \texttt{unsigned byte} (8 bit)

 \item \texttt{suck\_u2}: load two \texttt{unsigned byte}s (16 bit)

 \item \texttt{suck\_u4}: load four \texttt{unsigned byte}s (32 bit)

 \item \texttt{suck\_u8}: load eight \texttt{unsigned byte}s (64 bit)

 \item \texttt{suck\_float}: load four \texttt{byte}s (32 bit)
 converted into a \texttt{float} value

 \item \texttt{suck\_double}: load eight \texttt{byte}s (64 bit)
 converted into a \texttt{double} value

 \item \texttt{suck\_nbytes}: load \textit{n} bytes
\end{itemize}

Loading \texttt{signed} values is done via the
\texttt{suck\_s[1,2,4,8]} macros which cast the loaded bytes to
\texttt{signed} values. All these functions take a
\texttt{classbuffer} (see figure~\ref{classbufferstructure})
structure pointer as argument.
\begin{figure}
\begin{verbatim}
    typedef struct classbuffer {
        classinfo *class;    /* pointer to classinfo structure */
        u1        *data;     /* pointer to byte code           */
        s4         size;     /* size of the byte code          */
        u1        *pos;      /* current read position          */
    } classbuffer;
\end{verbatim}
\caption{\texttt{classbuffer} structure}
\label{classbufferstructure}
\end{figure}
This \texttt{classbuffer} structure is filled with data via the

\begin{verbatim}
    classbuffer *suck_start(classinfo *c);
\end{verbatim}

function. This function tries to locate the class, specified by the
\texttt{classinfo} structure, in the \texttt{CLASSPATH}. This can be
a plain class file in the filesystem or a file in a
\texttt{zip}/\texttt{jar} file. If the class file is found, the
\texttt{classbuffer} is filled with data collected from the class
file, including the class file size and the binary representation of
the class.

Before reading any byte of the binary representation with a
\texttt{suck\_*} function, the remaining bytes in the
\texttt{classbuffer} data array must be checked with the

\begin{verbatim}
    static inline bool check_classbuffer_size(classbuffer *cb, s4 len);
\end{verbatim}

function. If the number of remaining bytes is less than the number of
bytes to be read, specified by the \texttt{len} argument, a
\texttt{java.lang.ClassFormatError} with the detail message
\textit{Truncated class file}---as mentioned before---is thrown.
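
To illustrate how the size check and the \texttt{suck\_*} readers work together, here is a minimal sketch. It is not CACAO's implementation: all \texttt{demo\_*} names are hypothetical, errors are reported through a \texttt{bool} return value instead of a Java exception, and only the start of a class file (magic number, minor and major version, stored big-endian as defined by the class file format) is read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the classbuffer structure. */
typedef struct demo_classbuffer {
    const uint8_t *data;  /* start of the class file bytes */
    int32_t        size;  /* total number of bytes         */
    const uint8_t *pos;   /* next byte to be read          */
} demo_classbuffer;

/* True if at least `len` bytes remain at the current read position. */
static bool demo_check_size(const demo_classbuffer *cb, int32_t len)
{
    return len >= 0 && (cb->data + cb->size) - cb->pos >= len;
}

/* Read one unsigned byte; the caller must have checked the size. */
static uint8_t demo_suck_u1(demo_classbuffer *cb)
{
    assert(demo_check_size(cb, 1));
    return *cb->pos++;
}

/* Read a big-endian 16 bit value. */
static uint16_t demo_suck_u2(demo_classbuffer *cb)
{
    uint16_t hi = demo_suck_u1(cb);
    uint16_t lo = demo_suck_u1(cb);
    return (uint16_t)((hi << 8) | lo);
}

/* Read a big-endian 32 bit value. */
static uint32_t demo_suck_u4(demo_classbuffer *cb)
{
    uint32_t hi = demo_suck_u2(cb);
    uint32_t lo = demo_suck_u2(cb);
    return (hi << 16) | lo;
}

/* Read magic number and version; false on a truncated file or bad magic. */
static bool demo_read_header(demo_classbuffer *cb,
                             uint16_t *minor, uint16_t *major)
{
    if (!demo_check_size(cb, 4 + 2 + 2))
        return false;                    /* "Truncated class file" */
    if (demo_suck_u4(cb) != 0xCAFEBABEu)
        return false;                    /* "Bad magic number"     */
    *minor = demo_suck_u2(cb);
    *major = demo_suck_u2(cb);
    return true;
}
```

Checking the size once for a group of reads, as \texttt{demo\_read\_header} does for its eight bytes, keeps the individual readers free of error handling.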
The following subsections describe, in chronological order and in
greater detail, the individual steps of loading a class or interface
from its binary representation.
\subsection{Constant pool loading}
\label{sectionconstantpoolloading}

The class' constant pool is loaded via

\begin{verbatim}
    static bool class_loadcpool(classbuffer *cb, classinfo *c);
\end{verbatim}

from the \texttt{constant\_pool} table in the binary representation of
the class or interface. The constant pool needs to be parsed in two
passes. In the first pass the information loaded is saved in temporary
structures, which are further processed in the second pass, when the
complete constant pool has been traversed. Only when all constant pool
entries have been loaded can any constant pool entry be completely
resolved, and this resolving can only be done in a specific order:

\begin{enumerate}
 \item \texttt{CONSTANT\_Class}

 \item \texttt{CONSTANT\_String}

 \item \texttt{CONSTANT\_NameAndType}

 \item \texttt{CONSTANT\_Fieldref}, \texttt{CONSTANT\_Methodref} and
 \texttt{CONSTANT\_InterfaceMethodref} --- these are combined into one
 structure
\end{enumerate}

The remaining constant pool types \texttt{CONSTANT\_Integer},
\texttt{CONSTANT\_Float}, \texttt{CONSTANT\_Long},
\texttt{CONSTANT\_Double} and \texttt{CONSTANT\_Utf8} can be
completely resolved in the first pass and need no further processing.

The temporary structures, shown in
figure~\ref{constantpoolstructures}, are used to \textit{forward} the
data from the first pass into the second.
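
The two-pass idea can be sketched with a strongly reduced model. The following code is an assumption-laden illustration rather than CACAO's loader: it works on an already decoded array of entries instead of raw bytes, handles only \texttt{CONSTANT\_Utf8} (tag 1) and \texttt{CONSTANT\_Class} (tag 7), and stores the resolved name pointer where CACAO would store a \texttt{classinfo} pointer. All \texttt{demo\_*} names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { DEMO_UTF8 = 1, DEMO_CLASS = 7 };  /* tags from the class file format */

/* Hypothetical decoded input entry. */
typedef struct demo_raw_entry {
    int         tag;
    const char *utf8;        /* used by DEMO_UTF8 entries only  */
    int         name_index;  /* used by DEMO_CLASS entries only */
} demo_raw_entry;

/* Forward reference remembered in pass 1 and resolved in pass 2. */
typedef struct demo_forward_class {
    struct demo_forward_class *next;
    int this_index;
    int name_index;
} demo_forward_class;

/* Fills cptags/cpinfos (both of length count); 0 on success, -1 on error. */
static int demo_load_cpool(const demo_raw_entry *raw, int count,
                           int *cptags, const void **cpinfos)
{
    demo_forward_class *forwards = NULL;
    int status = 0;

    assert(raw != NULL && cptags != NULL && cpinfos != NULL);

    for (int i = 0; i < count; i++) {
        cptags[i]  = 0;
        cpinfos[i] = NULL;
    }

    /* Pass 1: resolve Utf8 entries, remember Class entries for later.
       Slot 0 is unused, as in the class file format. */
    for (int i = 1; i < count && status == 0; i++) {
        if (raw[i].tag == DEMO_UTF8) {
            cptags[i]  = DEMO_UTF8;
            cpinfos[i] = raw[i].utf8;
        } else if (raw[i].tag == DEMO_CLASS) {
            demo_forward_class *f = malloc(sizeof *f);
            if (f == NULL) {
                status = -1;
            } else {
                f->this_index = i;
                f->name_index = raw[i].name_index;
                f->next       = forwards;
                forwards      = f;
            }
        } else {
            status = -1;  /* entry type not handled by this sketch */
        }
    }

    /* Pass 2: every Utf8 entry is known now, so names can be resolved. */
    while (forwards != NULL) {
        demo_forward_class *f = forwards;
        int n = f->name_index;

        forwards = f->next;
        if (status == 0) {
            if (n > 0 && n < count && cptags[n] == DEMO_UTF8) {
                cptags[f->this_index]  = DEMO_CLASS;
                cpinfos[f->this_index] = cpinfos[n];
            } else {
                status = -1;  /* name index does not refer to a Utf8 entry */
            }
        }
        free(f);
    }

    return status;
}
```

A class entry at index 1 that names a Utf8 entry at index 2 cannot be resolved while index 1 is being read, which is exactly the situation the forward list handles.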
\begin{figure}
\begin{verbatim}
    /* CONSTANT_Class entries */
    typedef struct forward_class {
        struct forward_class *next;
    } forward_class;

    /* CONSTANT_String */
    typedef struct forward_string {
        struct forward_string *next;
    } forward_string;

    /* CONSTANT_NameAndType */
    typedef struct forward_nameandtype {
        struct forward_nameandtype *next;
    } forward_nameandtype;

    /* CONSTANT_Fieldref, CONSTANT_Methodref or CONSTANT_InterfaceMethodref */
    typedef struct forward_fieldmethint {
        struct forward_fieldmethint *next;
        u2 nameandtype_index;
    } forward_fieldmethint;
\end{verbatim}
\caption{temporary constant pool structures}
\label{constantpoolstructures}
\end{figure}
The \texttt{classinfo} structure has two pointers to arrays which
contain the class' constant pool information, namely \texttt{cptags}
and \texttt{cpinfos}. \texttt{cptags} contains the type of each
constant pool entry. \texttt{cpinfos} contains a pointer to the
constant pool entry itself. In the second pass the references are
resolved and the runtime structures are created. In detail, this
includes the following steps:

\begin{itemize}
 \item \texttt{CONSTANT\_Class}: get the UTF8 name string of the
 class, store type \texttt{CONSTANT\_Class} in \texttt{cptags}, create
 a class in the class hashtable with the UTF8 name and store the
 pointer to the new class in \texttt{cpinfos}

 \item \texttt{CONSTANT\_String}: get the UTF8 string of the
 referenced string, store type \texttt{CONSTANT\_String} in
 \texttt{cptags} and store the UTF8 string pointer in
 \texttt{cpinfos}

 \item \texttt{CONSTANT\_NameAndType}: create a
 \texttt{constant\_nameandtype} (see
 figure~\ref{constantnameandtype}) structure, get the UTF8 name and
 descriptor string of the field or method and store them in the
 \texttt{constant\_nameandtype} structure, store type
 \texttt{CONSTANT\_NameAndType} in \texttt{cptags} and store a
 pointer to the \texttt{constant\_nameandtype} structure in
 \texttt{cpinfos}

\begin{figure}
\begin{verbatim}
    typedef struct {            /* NameAndType (Field or Method)       */
        utf *name;              /* field/method name                   */
        utf *descriptor;        /* field/method type descriptor string */
    } constant_nameandtype;
\end{verbatim}
\caption{\texttt{constant\_nameandtype} structure}
\label{constantnameandtype}
\end{figure}

 \item \texttt{CONSTANT\_Fieldref}, \texttt{CONSTANT\_Methodref} and
 \texttt{CONSTANT\_InterfaceMethodref}: create a
 \texttt{constant\_FMIref} (see figure~\ref{constantFMIref})
 structure, get the referenced \texttt{constant\_nameandtype}
 structure, which contains the name and descriptor resolved in a
 previous step, and store the name and descriptor in the
 \texttt{constant\_FMIref} structure, get the pointer to the
 referenced class, which was created in a previous step, and store it
 in the \texttt{constant\_FMIref} structure, store the type of the
 current constant pool entry in \texttt{cptags} and store a pointer to
 the \texttt{constant\_FMIref} structure in \texttt{cpinfos}

\begin{figure}
\begin{verbatim}
    typedef struct {     /* Fieldref, Methodref and InterfaceMethodref     */
        classinfo *class;       /* class containing this field/method/interface */
        utf       *name;        /* field/method/interface name                  */
        utf       *descriptor;  /* field/method/interface type descriptor string */
    } constant_FMIref;
\end{verbatim}
\caption{\texttt{constant\_FMIref} structure}
\label{constantFMIref}
\end{figure}
\end{itemize}

Any UTF8 strings, \texttt{constant\_nameandtype} structures or
referenced classes are resolved with the

\begin{verbatim}
    voidptr class_getconstant(classinfo *c, u4 pos, u4 ctype);
\end{verbatim}

function. This function checks for type equality and then returns the
requested \texttt{cpinfos} slot of the specified class.
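
A lookup of this kind might look like the following sketch. It is a simplified illustration with the hypothetical name \texttt{demo\_getconstant}: the constant pool arrays are passed directly instead of a \texttt{classinfo} pointer, a mismatch is reported by returning \texttt{NULL}, and the bounds check is an assumption added here for safety, not something the text above states.

```c
#include <assert.h>
#include <stddef.h>

/* Return the cpinfos slot at `pos` if its tag equals `ctype`,
   otherwise NULL. */
static const void *demo_getconstant(const int *cptags,
                                    const void *const *cpinfos,
                                    int cpcount, int pos, int ctype)
{
    assert(cptags != NULL && cpinfos != NULL);

    if (pos < 0 || pos >= cpcount)
        return NULL;            /* index outside the constant pool */
    if (cptags[pos] != ctype)
        return NULL;            /* entry has a different type      */

    return cpinfos[pos];
}
```

Because the tag is compared before the slot is returned, a caller that asks for a class entry can never receive a pointer to a UTF8 string by mistake.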
\subsection{Interface loading}

Interface loading is simple and straightforward. After reading
the number of interfaces, for every interface referenced, a
\texttt{u2} constant pool index is read from the currently loading
class or interface. This index is used to resolve the interface class
via the \texttt{class\_getconstant} function from the class' constant
pool. This means that interface \textit{loading} is more interface
\textit{resolving} than loading. The resolved interfaces are stored
in a \texttt{classinfo *} array allocated by the class loader. The
memory pointer of the array is assigned to the \texttt{interfaces}
field of the \texttt{classinfo} structure.
\subsection{Field loading}

The number of fields of the class or interface is read as a
\texttt{u2} value. For each field the function

\begin{verbatim}
    static bool field_load(classbuffer *cb, classinfo *c, fieldinfo *f);
\end{verbatim}

is called. The \texttt{fieldinfo *} argument is a pointer to a
\texttt{fieldinfo} structure (see figure~\ref{fieldinfostructure})
allocated by the class loader. The field's \texttt{name} and
\texttt{descriptor} are resolved from the class constant pool via
\texttt{class\_getconstant}. If the verifier option is turned on, the
field's \texttt{flags}, \texttt{name} and \texttt{descriptor} are
checked for validity, which can result in a
\texttt{java.lang.ClassFormatError}.
\begin{figure}
\begin{verbatim}
    struct fieldinfo {        /* field of a class                           */
        s4   flags;           /* ACC flags                                  */
        s4   type;            /* basic data type                            */
        utf *name;            /* name of field                              */
        utf *descriptor;      /* JavaVM descriptor string of field          */

        s4   offset;          /* offset from start of object (instance variables) */

        imm_union value;      /* storage for static values (class variables) */

        classinfo *class;     /* needed by typechecker. Could be optimized  */
                              /* away by using constant_FMIref instead of   */
                              /* fieldinfo throughout the compiler.         */
    };
\end{verbatim}
\caption{\texttt{fieldinfo} structure}
\label{fieldinfostructure}
\end{figure}
Each field can have some attributes. The number of attributes is read
as a \texttt{u2} value from the binary representation. If the field has
the \texttt{ACC\_FINAL} bit set in the flags, the
\texttt{ConstantValue} attribute is available. This is the only
attribute processed by \texttt{field\_load} and it may occur only once,
otherwise a \texttt{java.lang.ClassFormatError} is thrown. The
\texttt{ConstantValue} entry in the constant pool contains the value
for the \texttt{final} field. Depending on the field's type, the
proper constant pool entry is resolved and assigned.
\subsection{Method loading}

As for the fields, the number of the class or interface methods is read
from the binary representation as a \texttt{u2} value. For each method
the function

\begin{verbatim}
    static bool method_load(classbuffer *cb, classinfo *c, methodinfo *m);
\end{verbatim}

is called. The beginning of the method loading code is nearly the same
as the field loading code. The \texttt{methodinfo *} argument is a
pointer to a \texttt{methodinfo} structure allocated by the class
loader. The method's \texttt{name} and \texttt{descriptor} are
resolved from the class constant pool via
\texttt{class\_getconstant}. With the verifier turned on, some method
checks are carried out. These include \texttt{flags}, \texttt{name}
and \texttt{descriptor} checks and an argument count check.
\begin{figure}
\begin{verbatim}
    struct methodinfo {                 /* method structure                   */
        java_objectheader header;       /* we need this in jit's monitorenter */
        s4          flags;              /* ACC flags                          */
        utf        *name;               /* name of method                     */
        utf        *descriptor;         /* JavaVM descriptor string of method */

        bool        isleafmethod;       /* does method call subroutines       */

        classinfo  *class;              /* class, the method belongs to       */
        s4          vftblindex;         /* index of method in virtual function */
                                        /* table (if it is a virtual method)  */
        s4          maxstack;           /* maximum stack depth of method      */
        s4          maxlocals;          /* maximum number of local variables  */
        s4          jcodelength;        /* length of JavaVM code              */
        u1         *jcode;              /* pointer to JavaVM code             */

        s4          exceptiontablelength; /* exceptiontable length            */
        exceptiontable *exceptiontable; /* the exceptiontable                 */

        u2          thrownexceptionscount; /* number of exceptions attribute  */
        classinfo **thrownexceptions;   /* checked exceptions a method may throw */

        u2          linenumbercount;    /* number of linenumber attributes    */
        lineinfo   *linenumbers;        /* array of lineinfo items            */

        u1         *stubroutine;        /* stub for compiling or calling natives */
    };
\end{verbatim}
\caption{\texttt{methodinfo} structure}
\label{methodinfostructure}
\end{figure}
The method loading function has to distinguish between a
\texttt{native} and a ``normal'' Java method. Depending on the
\texttt{ACC\_NATIVE} flag, a different stub is created.

For a Java method, a \textit{compiler stub} is created. The purpose of
this stub is to call the CACAO JIT compiler with a pointer to the byte
code of the Java method as argument to compile the method into machine
code. During code generation a pointer to this compiler stub routine
is used as a temporary method call target if the method is not compiled
yet. After the target method is compiled, the new entry point of the
method is patched into the generated code and the compiler stub is
no longer needed, so it is freed.
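
The stub mechanism can be pictured with plain function pointers. The sketch below is a deliberately simplified model and not how CACAO generates code: instead of patching call sites in machine code, it patches a per-method entry pointer, and the ``compiler'' merely returns a fixed C function. All \texttt{demo\_*} names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct demo_method demo_method;
typedef int (*demo_entry)(demo_method *m, int arg);

struct demo_method {
    demo_entry entry;          /* compiler stub first, compiled code later */
    int        compile_count;  /* how often the compiler has been invoked  */
};

/* Stand-in for the machine code produced by the compiler. */
static int demo_compiled_code(demo_method *m, int arg)
{
    (void) m;
    return arg * 2;
}

/* Stand-in for the JIT compiler: returns the new entry point. */
static demo_entry demo_jit_compile(demo_method *m)
{
    m->compile_count++;
    return demo_compiled_code;
}

/* Compiler stub: compile on first call, patch the entry, then run it. */
static int demo_compiler_stub(demo_method *m, int arg)
{
    m->entry = demo_jit_compile(m);
    return m->entry(m, arg);
}

/* Callers always go through the current entry point. */
static int demo_invoke(demo_method *m, int arg)
{
    assert(m != NULL && m->entry != NULL);
    return m->entry(m, arg);
}
```

A method initialized with \texttt{demo\_compiler\_stub} as its entry is compiled on the first \texttt{demo\_invoke} call only; later calls reach the compiled code directly.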
If the method is a \texttt{native} method, the loader tries to find
the native function. If the function is found, a \textit{native stub}
is generated. This stub is responsible for adapting the method's
arguments so that they are suitable for the \texttt{native} method
called. This includes inserting the \textit{JNI environment} pointer
as first argument and, if the \texttt{native} method has the
\texttt{ACC\_STATIC} flag set, inserting a pointer to the method's
class as second argument. If the \texttt{native} method is
\texttt{static}, the native stub also checks if the method's class is
already initialized. If the method's class is not initialized when the
native stub is generated, the stub contains code which calls
\texttt{asm\_check\_clinit}.
Each method can have some attributes. The method loading function
processes two of them: \texttt{Code} and \texttt{Exceptions}.

The \texttt{Code} attribute is a \textit{variable-length} attribute
which contains the Java Virtual Machine instructions---the byte
code---of the Java method. If the method is either \texttt{native} or
\texttt{abstract}, it must not have a \texttt{Code} attribute,
otherwise it must have exactly one \texttt{Code}
attribute. In addition to the byte code, the \texttt{Code} attribute
contains the exception table and the attributes of the \texttt{Code}
attribute itself. One exception table entry contains the
\texttt{start\_pc}, \texttt{end\_pc} and
\texttt{handler\_pc} of the \texttt{try-catch} block, each read as a
\texttt{u2} value, plus a reference to the class of the
\texttt{catch\_type}. Currently there are two attributes of the
\texttt{Code} attribute defined in the JVM specification:
\texttt{LineNumberTable} and \texttt{LocalVariableTable}. CACAO only
processes the \texttt{LineNumberTable} attribute. A
\texttt{LineNumberTable} entry consists of the \texttt{start\_pc} and
the \texttt{line\_number}, which are stored in a \texttt{lineinfo}
structure (see figure~\ref{lineinfostructure}).
\begin{figure}
\caption{\texttt{lineinfo} structure}
\label{lineinfostructure}
\end{figure}
The line number count and the memory pointer of the \texttt{lineinfo}
structure array are assigned to the \texttt{methodinfo} fields
\texttt{linenumbercount} and \texttt{linenumbers} respectively.

The \texttt{Exceptions} attribute is a \textit{variable-length}
attribute and contains the checked exceptions the Java method may
throw. The \texttt{Exceptions} attribute consists of the count of
exceptions, which is stored in the \texttt{methodinfo} field
\texttt{thrownexceptionscount}, and the corresponding number of
\texttt{u2} constant pool index values. The exception classes are
resolved from the constant pool and stored in an allocated
\texttt{classinfo *} array, whose memory pointer is assigned to the
\texttt{thrownexceptions} field of the \texttt{methodinfo} structure.
Any attributes which are not processed by the CACAO class loading
system are skipped via

\begin{verbatim}
    static bool skipattributebody(classbuffer *cb);
\end{verbatim}

which skips one attribute or

\begin{verbatim}
    static bool skipattributes(classbuffer *cb, u4 num);
\end{verbatim}

which skips a specified number \texttt{num} of attributes. If any
problem occurs in the method loading function, a
\texttt{java.lang.ClassFormatError} with a specific detail message is
thrown.
\subsection{Attribute loading}

Attribute loading is done via the

\begin{verbatim}
    static bool attribute_load(classbuffer *cb, classinfo *c, u4 num);
\end{verbatim}

function. The currently loading class or interface can contain some
additional attributes which have not already been loaded. The CACAO
system class loader processes two of them: \texttt{InnerClasses} and
\texttt{SourceFile}.

The \texttt{InnerClasses} attribute is a \textit{variable-length}
attribute in the \texttt{attributes} table of the binary
representation of the class or interface. An \texttt{InnerClasses}
entry contains the \texttt{inner\_class} constant pool index itself,
the \texttt{outer\_class} index, the \texttt{name} index of the inner
class' name and the inner class' \texttt{flags} bitmask. All these
values are read in \texttt{u2} chunks.
The constant pool indexes are used with the

\begin{verbatim}
    voidptr innerclass_getconstant(classinfo *c, u4 pos, u4 ctype);
\end{verbatim}

function call to resolve the classes or UTF8 strings. After resolving
is done, all values are stored in the \texttt{innerclassinfo}
structure (see figure~\ref{innerclassinfostructure}).

\begin{figure}
\begin{verbatim}
    struct innerclassinfo {
        classinfo *inner_class;       /* inner class pointer */
        classinfo *outer_class;       /* outer class pointer */
        utf       *name;              /* innerclass name     */
        s4         flags;             /* ACC flags           */
    };
\end{verbatim}
\caption{\texttt{innerclassinfo} structure}
\label{innerclassinfostructure}
\end{figure}

The other attribute, \texttt{SourceFile}, is just one \texttt{u2}
constant pool index value to get the UTF8 string reference of the
class' \texttt{SourceFile} name. The string pointer is assigned to the
\texttt{sourcefile} field of the \texttt{classinfo} structure.

Each of the two attributes may occur only once. Attributes other than
these two are skipped with the earlier mentioned
\texttt{skipattributebody} function.

After the attribute loading is done and no error occurred, the
\texttt{class\_load\_intern} function returns the \texttt{classinfo}
pointer to signal that there was no problem. If \texttt{NULL} is
returned, there was an exception.
\section{Dynamic class loader}


\section{Eager - lazy class loading}

A Java Virtual Machine can implement two different algorithms for the
system class loader to load classes or interfaces: \textit{eager class
loading} and \textit{lazy class loading}.


\subsection{Eager class loading}
\label{sectioneagerclassloading}

The Java Virtual Machine initially creates, loads and links the class
of the main program with the system class loader. The creation of the
class is done via the \texttt{class\_new} function call (see section
\ref{sectionsystemclassloader}). In this function, with \textit{eager
loading} enabled, the currently created class or interface is first
loaded with \texttt{class\_load}. CACAO uses the \textit{eager class
loading} algorithm when started with the command line switch
\texttt{-eager}. As described in the ``Constant pool loading'' section
(see \ref{sectionconstantpoolloading}), the binary representation of a
class or interface contains references to other classes or
interfaces. With \textit{eager loading} enabled, referenced classes or
interfaces are loaded immediately.
If a class reference is found in the second pass of the constant pool
loading process, the class is created in the class hashtable with
\texttt{class\_new\_intern}. CACAO uses the intern function here
because the normal \texttt{class\_new} function, which is a wrapper
function, instantly tries to \textit{link} all referenced
classes. This must not happen until all classes or interfaces
referenced are loaded, otherwise the Java Virtual Machine gets into an

After the \texttt{classinfo} of the referenced class is created, the
class or interface is \textit{loaded} via the \texttt{class\_load}
function (described in more detail in section
\ref{sectionsystemclassloader}). When the class loading function
returns, the referenced class or interface is added to a list
called \texttt{unlinkedclasses}, which contains all loaded but
unlinked classes referenced by the currently loaded class or
interface. This list is processed in the \texttt{class\_new} function
of the currently created class or interface after \texttt{class\_load}
returns. For each entry in the \texttt{unlinkedclasses} list,
\texttt{class\_link} is called, which finally \textit{links} the class
(described in more detail in section \ref{sectionlinking}), and then
the class entry is removed from the list. When all referenced classes
or interfaces are linked, the currently created class or interface is
linked and the \texttt{class\_new} function returns.
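
The interplay of loading and the \texttt{unlinkedclasses} list can be sketched as a small worklist algorithm. The code below is a reduced model under stated assumptions, not CACAO's implementation: class references are given as a fixed array, ``loading'' and ``linking'' only set flags, and all \texttt{demo\_*} names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_REFS 4

/* Hypothetical, reduced class descriptor. */
typedef struct demo_class {
    bool loaded;
    bool linked;
    struct demo_class *refs[DEMO_MAX_REFS];  /* classes referenced in the pool */
    int  refcount;
    struct demo_class *nextunlinked;         /* link in the unlinked list      */
} demo_class;

/* Loaded but not yet linked classes. */
static demo_class *demo_unlinkedclasses = NULL;

/* Load a class and, eagerly, every class it references. */
static void demo_class_load(demo_class *c)
{
    assert(c != NULL && c->refcount <= DEMO_MAX_REFS);

    if (c->loaded)
        return;
    c->loaded = true;  /* set first, so reference cycles terminate */

    for (int i = 0; i < c->refcount; i++) {
        demo_class *r = c->refs[i];

        if (!r->loaded) {
            demo_class_load(r);
            r->nextunlinked      = demo_unlinkedclasses;
            demo_unlinkedclasses = r;
        }
    }
}

/* Create a class: load everything first, link only afterwards. */
static void demo_class_new(demo_class *c)
{
    demo_class_load(c);

    while (demo_unlinkedclasses != NULL) {
        demo_class *u = demo_unlinkedclasses;

        demo_unlinkedclasses = u->nextunlinked;
        u->nextunlinked      = NULL;
        u->linked            = true;   /* stands in for the real linker */
    }

    c->linked = true;
}
```

Because linking is deferred until the list is processed, no class is linked while one of the classes it references is still unloaded, even if the references form a cycle.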
\subsection{Lazy class loading}
\label{sectionlazyclassloading}

With \textit{eager class loading}, it usually takes much more time for
a Java Virtual Machine to start a program than with \textit{lazy class
loading}. With \textit{eager class loading}, a typical
\texttt{HelloWorld} program needs 513 class loads with the current GNU
classpath CACAO is using. When using \textit{lazy class loading},
CACAO only needs 121 class loads for the same \texttt{HelloWorld}
program. This means that with \textit{lazy class loading} CACAO loads
fewer than a quarter as many class files. Furthermore CACAO also
performs \textit{lazy class linking}, which saves even more run time.

CACAO's \textit{lazy class loading} implementation does not completely
follow the JVM specification. A Java Virtual Machine which implements
\textit{lazy class loading} should load and link requested classes or
interfaces at runtime. But CACAO does class loading and linking at
parse time, because of some problems not resolved yet. That means, if
a Java Virtual Machine instruction is parsed which uses any class or
interface references, like \texttt{JAVA\_PUTSTATIC},
\texttt{JAVA\_GETFIELD} or any \texttt{JAVA\_INVOKE*} instruction,
the referenced class or interface is loaded and linked immediately
during the parse pass of the currently compiled method. This introduces
some incompatibilities with other Java Virtual Machines like Sun's
JVM, IBM's JVM or Kaffe.
Given a code snippet like this

\begin{verbatim}
    void sub(boolean b) {
            System.out.println("foobar");
\end{verbatim}

If the function is called with \texttt{b} equal to \texttt{false} and
the class file \texttt{A.class} does not exist, a Java Virtual Machine
should execute the code without any problems, print \texttt{foobar}
and exit with exit code 0. Because CACAO does class loading and
linking at parse time, the CACAO Virtual Machine throws a
\texttt{java.lang.NoClassDefFoundError:~A} exception which is not
caught, and thus stops execution without printing \texttt{foobar} and
exits.
The CACAO development team does not yet have a solution for this
problem. It is not trivial to move the loading and linking process from
the compilation phase into runtime, especially since CACAO was
initially designed for \textit{eager class loading} and \textit{lazy
class loading} was implemented at a later time to optimize class
loading and to get a little closer to the JVM specification.
\textit{Lazy class loading} at runtime is one of the most important
features to be implemented in the future. It is essential to make
CACAO a standard-compliant Java Virtual Machine.
\section{Linking}
\label{sectionlinking}

Linking is the process of preparing a previously loaded class or
interface to be used in the Java Virtual Machine's runtime
environment. The function which performs the linking in CACAO is

\begin{verbatim}
    classinfo *class_link(classinfo *c);
\end{verbatim}

This function, as for class loading, is just a wrapper function around
the main linking function

\begin{verbatim}
    static classinfo *class_link_intern(classinfo *c);
\end{verbatim}

This function should not be called directly and is thus declared as
\texttt{static}. The purposes of the wrapper function are

\begin{itemize}
 \item enter a monitor on the \texttt{classinfo} structure, so that
 only one thread can link the same class or interface at the same time

 \item check if the class or interface is \texttt{linked}; if it is
 \texttt{true}, leave the monitor and return immediately

 \item measure linking time if requested

 \item check if the intern linking function has thrown an error or an
 exception and reset the \texttt{linked} field of the
 \texttt{classinfo} structure

 \item leave the monitor
\end{itemize}
851 The \texttt{class\_link} function, like the \texttt{class\_load}
852 function, is implemented to be \textit{reentrant}. This must be the
853 case for the linking algorithm implemented in CACAO. Furthermore this
854 means that serveral threads can link different classes or interfaces
855 at the same time on multiprocessor machines.
857 The first step in the \texttt{class\_link\_intern} function is to set
858 the \texttt{linked} field of the currently linked \texttt{classinfo}
859 structure to \texttt{true}. This is essential so that the linker does
860 not try to link a class or interface again while it is already in the
861 linking process. Such a case can occur because the linker also
862 processes the class's direct superclass and direct superinterfaces.
864 In CACAO's linker the direct superinterfaces are processed first. For
865 each interface in the \texttt{interfaces} field of the
866 \texttt{classinfo} structure, the linker checks for a
867 \texttt{java.lang.ClassCircularityError}, which occurs when the
868 currently linked class or interface is equal to the interface that
869 should be processed. Otherwise the interface is loaded and linked if
870 not already done. After the interface is loaded successfully, the
871 interface flags are checked for the \texttt{ACC\_INTERFACE} bit. If
872 this bit is not set, a
873 \texttt{java.lang.IncompatibleClassChangeError} is thrown and
874 \texttt{class\_link\_intern} returns.
876 Then the direct superclass is handled. If the direct superclass is
877 \texttt{NULL}, we have the special case of linking
878 \texttt{java.lang.Object}. In this case only some \texttt{classinfo}
879 fields are set to special values for \texttt{java.lang.Object}, like
883 c->instancesize = sizeof(java_objectheader);
888 If the direct superclass is non-\texttt{NULL}, CACAO first checks for
889 class circularity, as for interfaces. If no
890 \texttt{java.lang.ClassCircularityError} was thrown, the superclass is
891 loaded and linked if not already done before. Then some flag bits of
892 the superclass are checked: \texttt{ACC\_INTERFACE} and
893 \texttt{ACC\_FINAL}. If one of these bits is set, an error is thrown.
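
The supertype checks described above can be summarized in a small sketch.
It is a simplified illustration, not the CACAO implementation: the
structure \texttt{super\_class\_sketch}, the result codes and the function
\texttt{check\_supertypes\_sketch} are hypothetical, loading and linking of
the supertypes are omitted, and errors are returned as codes instead of
being thrown. The flag values are the ones defined for class files.

```c
#include <assert.h>
#include <stddef.h>

#define ACC_FINAL     0x0010   /* class file access flag values */
#define ACC_INTERFACE 0x0200

/* Hypothetical, reduced stand-in for the classinfo structure. */
typedef struct super_class_sketch {
    int                               flags;
    const struct super_class_sketch  *super;       /* NULL for Object */
    const struct super_class_sketch **interfaces;
    int                               interfacescount;
} super_class_sketch;

typedef enum {
    CHECK_OK,
    CHECK_CIRCULARITY,      /* class is its own supertype */
    CHECK_NOT_INTERFACE,    /* listed interface lacks ACC_INTERFACE */
    CHECK_BAD_SUPERCLASS    /* superclass is an interface or final */
} check_result_sketch;

check_result_sketch check_supertypes_sketch(const super_class_sketch *c)
{
    int i;

    assert(c != NULL);

    /* direct superinterfaces are handled first */
    for (i = 0; i < c->interfacescount; i++) {
        const super_class_sketch *ic = c->interfaces[i];

        if (ic == c)
            return CHECK_CIRCULARITY;
        if (!(ic->flags & ACC_INTERFACE))
            return CHECK_NOT_INTERFACE;
    }

    /* then the direct superclass, if there is one */
    if (c->super != NULL) {
        if (c->super == c)
            return CHECK_CIRCULARITY;
        if (c->super->flags & (ACC_INTERFACE | ACC_FINAL))
            return CHECK_BAD_SUPERCLASS;
    }

    return CHECK_OK;
}
```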
895 If the currently linked class is an array, CACAO calls a special array linking function:
899 static arraydescriptor *class_link_array(classinfo *c);
902 This function first checks if the passed \texttt{classinfo} is an
903 \textit{array of arrays} or an \textit{array of objects}. In both
904 cases the component type is created in the class hashtable via
905 \texttt{class\_new} and then loaded and linked if not already
906 done. If neither is the case, the passed array is a \textit{primitive
907 type array}. Regardless of the array type, an
908 \texttt{arraydescriptor} structure (see
909 figure~\ref{arraydescriptorstructure}) is allocated and filled with
910 the appropriate values of the given array type.
914 struct arraydescriptor {
915 vftbl_t *componentvftbl; /* vftbl of the component type, NULL for primit. */
916 vftbl_t *elementvftbl; /* vftbl of the element type, NULL for primitive */
917 s2 arraytype; /* ARRAYTYPE_* constant */
918 s2 dimension; /* dimension of the array (always >= 1) */
919 s4 dataoffset; /* offset of the array data from object pointer */
920 s4 componentsize; /* size of a component in bytes */
921 s2 elementtype; /* ARRAYTYPE_* constant */
924 \caption{\texttt{arraydescriptor} structure}
925 \label{arraydescriptorstructure}
928 After the \texttt{class\_link\_array} function call, the class
929 \texttt{index} is calculated. For interfaces---classes with
930 \texttt{ACC\_INTERFACE} flag bit set---the class' \texttt{index} is
931 the global \texttt{interfaceindex} plus one. Any other classes get the
932 \texttt{index} of the superclass plus one.
934 Other \texttt{classinfo} fields are also set from the superclass, such as
935 \texttt{instancesize}, \texttt{vftbllength} and the \texttt{finalizer}
936 function. All these values are temporary ones and can be overwritten later.
939 The next step in \texttt{class\_link\_intern} is to compute the
940 \textit{virtual function table length}. For each method in
941 \texttt{classinfo}'s \texttt{methods} field which does not have the
942 \texttt{ACC\_STATIC} flag bit set, and thus is an instance method, the
943 superclasses up to \texttt{java.lang.Object} are checked with
946 static bool method_canoverwrite(methodinfo *m, methodinfo *old);
949 if the current method can overwrite the superclass method, if one
950 exists. If the found superclass method has the
951 \texttt{ACC\_PRIVATE} flag bit set, the current method's
952 \textit{virtual function table index} is the current \textit{virtual
953 function table length}, which is then incremented by one:
956 m->vftblindex = (vftbllength++);
959 If the found superclass method has the \texttt{ACC\_FINAL} flag bit set, the
960 CACAO class linker throws a \texttt{java.lang.VerifyError}. Otherwise
961 the current method's \textit{virtual function table index} is the same
962 as the index from the superclass method:
965 m->vftblindex = tc->methods[j].vftblindex;
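
The index assignment can be illustrated with the following sketch. It is
written under several assumptions and is not CACAO's code: the types
\texttt{vt\_method\_sketch} and \texttt{vt\_class\_sketch} and all function
names are hypothetical, the overwrite test is reduced to a comparison of a
combined name and descriptor string, a method with no matching superclass
method is assumed to get a new slot, and the \texttt{ACC\_FINAL} check is
applied to the superclass method being overwritten, as the JVM
specification requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ACC_PRIVATE 0x0002   /* class file access flag values */
#define ACC_STATIC  0x0008
#define ACC_FINAL   0x0010

typedef struct vt_method_sketch {
    const char *name;        /* name plus descriptor, e.g. "run()V" */
    int         flags;
    int         vftblindex;  /* filled in by the sketch */
} vt_method_sketch;

typedef struct vt_class_sketch {
    struct vt_class_sketch *super;   /* NULL for the root class */
    vt_method_sketch       *methods;
    int                     methodscount;
    int                     vftbllength;
} vt_class_sketch;

/* Simplified stand-in for method_canoverwrite. */
static bool can_overwrite_sketch(const vt_method_sketch *m,
                                 const vt_method_sketch *old)
{
    return !(old->flags & ACC_STATIC) && strcmp(m->name, old->name) == 0;
}

/* Returns false if a final superclass method would be overwritten. */
bool assign_vftbl_indices_sketch(vt_class_sketch *c)
{
    int i, j;
    int vftbllength = (c->super != NULL) ? c->super->vftbllength : 0;

    for (i = 0; i < c->methodscount; i++) {
        vt_method_sketch *m = &c->methods[i];
        vt_class_sketch  *tc;
        bool              done = false;   /* search finished */
        bool              reuse = false;  /* superclass slot reused */

        if (m->flags & ACC_STATIC)
            continue;                     /* static: no table slot */

        for (tc = c->super; tc != NULL && !done; tc = tc->super) {
            for (j = 0; j < tc->methodscount && !done; j++) {
                vt_method_sketch *old = &tc->methods[j];

                if (!can_overwrite_sketch(m, old))
                    continue;
                if (old->flags & ACC_PRIVATE) {
                    done = true;          /* private: take a new slot */
                } else if (old->flags & ACC_FINAL) {
                    return false;         /* final method overwritten */
                } else {
                    m->vftblindex = old->vftblindex;
                    done = reuse = true;
                }
            }
        }

        if (!reuse)
            m->vftblindex = vftbllength++;
    }

    c->vftbllength = vftbllength;
    return true;
}
```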
968 After processing the \textit{virtual function table length}, the CACAO
969 linker computes the \textit{interface table length}. For the current
970 class' and every superclass' interfaces, the function
973 static s4 class_highestinterface(classinfo *c);
976 is called. This function computes the highest interface \texttt{index}
977 of the passed interface and returns the value. This is done by
978 recursively calling \texttt{class\_highestinterface} with each
979 interface from the passed interface. The highest \texttt{index} value
980 found is the \textit{interface table length} of the currently linked class or interface.
983 Now that the linker has completely computed the size of the
984 \textit{virtual function table}, the memory can be allocated, cast
985 to a \texttt{vftbl} structure (see figure~\ref{vftblstructure}) and
986 filled with the previously calculated values.
991 methodptr *interfacetable[1]; /* interface table (access via macro) */
993 classinfo *class; /* class, the vtbl belongs to */
995 arraydescriptor *arraydesc; /* for array classes, otherwise NULL */
997 s4 vftbllength; /* virtual function table length */
998 s4 interfacetablelength; /* interface table length */
1000 s4 baseval; /* base for runtime type check */
1001 /* (-index for interfaces) */
1002 s4 diffval; /* high - base for runtime type check */
1004 s4 *interfacevftbllength; /* length of interface vftbls */
1006 methodptr table[1]; /* class vftbl */
1009 \caption{\texttt{vftbl} structure}
1010 \label{vftblstructure}
1013 Some important values are
1016 c->header.vftbl = c->vftbl = v;
1018 v->vftbllength = vftbllength;
1019 v->interfacetablelength = interfacetablelength;
1020 v->arraydesc = arraydesc;
1023 If the currently linked class is an interface, the \texttt{baseval} of
1024 the interface's \textit{virtual function table} is set to
1025 \texttt{-(c->index)}. Then the \textit{virtual function table} of the
1026 direct superclass is copied into the \texttt{table} field of the
1027 current \textit{virtual function table} and for each
1028 non-\texttt{static} method in the current class's or interface's
1029 \texttt{methods} field, the pointer to the \textit{stubroutine} of the
1030 method is stored in the \textit{virtual function table}.
1032 Now the fields of the currently linked class or interface are
1033 processed. The CACAO linker computes the instance size of the class or
1034 interface and the offset of each field inside. For each field in the
1035 \texttt{classinfo} field \texttt{fields} which is non-\texttt{static},
1036 the type-size is resolved via the \texttt{desc\_typesize} function
1037 call. Then a new \texttt{instancesize} is calculated with
1040 c->instancesize = ALIGN(c->instancesize, dsize);
1043 which does memory alignment suitable for the next field. This newly
1044 computed \texttt{instancesize} is the \texttt{offset} of the currently
1045 processed field. The type-size is then added to get the real
1046 \texttt{instancesize}.
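
The offset calculation can be tried out with the following sketch. It
assumes that alignment means rounding up to the next multiple of the field
size and that field sizes are powers of two. The names
\texttt{align\_up\_sketch} and \texttt{layout\_fields\_sketch} are
hypothetical; the actual definition of the \texttt{ALIGN} macro is not
shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Round size up to the next multiple of align (a power of two).
   Hypothetical stand-in for the ALIGN macro used in the text. */
static int32_t align_up_sketch(int32_t size, int32_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Compute the offset of each non-static field and return the resulting
   instance size. instancesize is the size inherited from the superclass,
   sizes[i] is the type size of field i in bytes. */
int32_t layout_fields_sketch(int32_t instancesize, const int32_t *sizes,
                             int32_t *offsets, int count)
{
    int i;

    for (i = 0; i < count; i++) {
        instancesize = align_up_sketch(instancesize, sizes[i]);
        offsets[i]   = instancesize;   /* aligned size is the offset */
        instancesize += sizes[i];      /* add the type size */
    }

    return instancesize;
}
```

With an inherited size of 8 bytes and fields of 1, 4, 8 and 2 bytes, the
fields are placed at offsets 8, 12, 16 and 24, and the instance size
becomes 26 bytes.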
1048 The next step of the CACAO linker is to initialize two \textit{virtual
1049 function table} fields, namely \texttt{interfacevftbllength} and
1050 \texttt{interfacetable}. For \texttt{interfacevftbllength} an
1051 \texttt{s4} array of \texttt{interfacetablelength} elements is
1052 allocated. Each \texttt{interfacevftbllength} element is initialized
1053 with \texttt{0} and the elements in \texttt{interfacetable} with
1054 \texttt{NULL}. After the initialization is done, the interfaces of the
1055 currently linked class and all its superclasses, up to
1056 \texttt{java.lang.Object}, are processed via the
1059 static void class_addinterface(classinfo *c, classinfo *ic);
1062 function call. This function adds the methods of the passed interface
1063 to the \textit{virtual function table} of the passed class or
1064 interface. If the method count of the passed interface is zero, the
1065 function adds a fake method entry, which is needed for subtype tests:
1069 v->interfacevftbllength[i] = 1;
1070 v->interfacetable[-i] = MNEW(methodptr, 1);
1071 v->interfacetable[-i][0] = NULL;
1074 \texttt{i} represents the \texttt{index} of the passed interface
1075 \texttt{ic}, \texttt{v} the \textit{virtual function table} of the
1076 passed class or interface \texttt{c}.
1078 If the method count is non-zero, a \texttt{methodptr} array of
1079 \texttt{ic->methodscount} elements is allocated and the method count
1080 value is stored in the particular position of the
1081 \texttt{interfacevftbllength} array:
1084 v->interfacevftbllength[i] = ic->methodscount;
1085 v->interfacetable[-i] = MNEW(methodptr, ic->methodscount);
1088 For each method of the passed interface, the methods of the passed
1089 target class or interface and all superclass methods, up to
1090 \texttt{java.lang.Object}, are checked to see whether they can overwrite the
1091 interface method via \texttt{method\_canoverwrite}. If the function
1092 returns \texttt{true}, the corresponding function is resolved from the
1093 \texttt{table} field of the \textit{virtual function table} and stored
1094 in the corresponding position of the \texttt{interfacetable}:
1097 v->interfacetable[-i][j] = v->table[mi->vftblindex];
1100 The \texttt{class\_addinterface} function is also called recursively
1101 for all interfaces that the passed interface implements.
1103 After the interfaces have been added, and if the currently linked class or
1104 interface is not \texttt{java.lang.Object}, the CACAO linker tries to
1105 find a function whose name and descriptor match
1106 \texttt{finalize()V}. If an appropriate function is found and the
1107 function is non-\texttt{static}, it is assigned to the
1108 \texttt{finalizer} field of the \texttt{classinfo} structure. CACAO
1109 does not assign the \texttt{finalize()V} function to
1110 \texttt{java.lang.Object}, because this function is inherited by all
1111 subclasses which do not explicitly implement a \texttt{finalize()V}
1112 method. This would mean that, for each instantiated object which is marked
1113 for collection in the Java Virtual Machine, an empty function would be
1114 called from the garbage collector when a garbage collection takes place.
1117 The final task of the linker is to compute the \texttt{baseval} and
1118 \texttt{diffval} values from the subclasses of the currently linked
1119 class or interface. These values are used for \textit{runtime type
1120 checking} (described in more detail in
1121 section~\ref{sectionruntimetypechecking}). The calculation is done via
1125 void loader_compute_subclasses(classinfo *c);
1128 function call. This function sets the \texttt{nextsub} and
1129 \texttt{sub} fields of the \texttt{classinfo} structure, resets the
1130 global \texttt{classvalue} variable to zero and calls the
1133 static void loader_compute_class_values(classinfo *c);
1136 function with \texttt{java.lang.Object} as parameter. First of
1137 all, the \texttt{baseval} of the currently passed class or
1138 interface is set. The \texttt{baseval} is the global \texttt{classvalue} variable incremented by one:
1142 c->vftbl->baseval = ++classvalue;
1145 Then all subclasses of the currently passed class or interface are
1146 processed. For each subclass found,
1147 \texttt{loader\_compute\_class\_values} is recursively called. After
1148 all subclasses have been processed, the \texttt{diffval} of the
1149 currently passed class or interface is calculated. It is the
1150 difference between the current value of the global \texttt{classvalue} variable
1151 and the previously set \texttt{baseval}:
1154 c->vftbl->diffval = classvalue - c->vftbl->baseval;
1157 After the \texttt{baseval} and \texttt{diffval} values are newly
1158 calculated for all classes and interfaces in the Java Virtual Machine,
1159 the internal linker function \texttt{class\_link\_intern} returns the
1160 currently linked \texttt{classinfo} structure pointer to indicate
1161 that the linker function did not raise an error or exception.
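
The numbering scheme can be reproduced with a short sketch. The structure
\texttt{rt\_class\_sketch} and the function names are hypothetical, and
the range comparison in \texttt{is\_subclass\_sketch} is an assumption
about how such values are typically used for a subclass test; the actual
runtime type check is described in
section~\ref{sectionruntimetypechecking}.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for classinfo plus its vftbl. */
typedef struct rt_class_sketch {
    struct rt_class_sketch *sub;      /* first direct subclass */
    struct rt_class_sketch *nextsub;  /* next subclass of the same parent */
    int32_t                 baseval;
    int32_t                 diffval;
} rt_class_sketch;

static int32_t classvalue_sketch;     /* global counter, as in the text */

static void compute_values_sketch(rt_class_sketch *c)
{
    rt_class_sketch *s;

    c->baseval = ++classvalue_sketch;

    for (s = c->sub; s != NULL; s = s->nextsub)
        compute_values_sketch(s);     /* number all subclasses */

    c->diffval = classvalue_sketch - c->baseval;
}

/* Renumber the whole hierarchy, starting at the root class. */
void compute_class_values_sketch(rt_class_sketch *root)
{
    assert(root != NULL);
    classvalue_sketch = 0;
    compute_values_sketch(root);
}

/* Assumed range test: sub lies in [super.baseval,
   super.baseval + super.diffval]. The unsigned comparison also rejects
   values below super.baseval. */
bool is_subclass_sketch(const rt_class_sketch *sub,
                        const rt_class_sketch *super)
{
    return (uint32_t) (sub->baseval - super->baseval)
        <= (uint32_t) super->diffval;
}
```

Each class receives a \texttt{baseval} that is smaller than those of all
its subclasses, and \texttt{diffval} is the number of its direct and
indirect subclasses, so the subclasses of a class occupy a contiguous
range of values.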
1164 \section{Initialization}
1165 \label{sectioninitialization}
1167 A class or interface can have a \texttt{static} initialization
1168 function called \textit{static class initializer}. The function has
1169 the name \texttt{<clinit>()V}. This function must be invoked before a
1170 \texttt{static} function of the class is called or a \texttt{static}
1171 field is accessed via \texttt{ICMD\_PUTSTATIC} or
1172 \texttt{ICMD\_GETSTATIC}. In CACAO
1175 classinfo *class_init(classinfo *c);
1178 is responsible for the invocation of the \textit{static class
1179 initializer}. It is, like for class loading and class linking, just a
1180 wrapper function to the main initializing function
1183 static classinfo *class_init_intern(classinfo *c);
1186 The wrapper function has the following purposes:
1189 \item enter a monitor on the \texttt{classinfo} structure, so that
1190 only one thread can initialize the same class or interface at the same time
1193 \item check if the class or interface is \texttt{initialized} or
1194 \texttt{initializing}; if either is \texttt{true}, leave the monitor and return
1197 \item tag the class or interface as \texttt{initializing}
1199 \item call the internal initialization function
1200 \texttt{class\_init\_intern}
1202 \item if the internal initialization function returns
1203 non-\texttt{NULL}, the class or interface is tagged as
1204 \texttt{initialized}
1206 \item reset the \texttt{initializing} flag
1208 \item leave the monitor
1211 The internal initializing function should not be called directly,
1212 because of race conditions between concurrent threads. Two or more
1213 different threads could access a \texttt{static} field or call a
1214 \texttt{static} function of an uninitialized class at almost the same
1215 time. This means that each single thread would invoke the
1216 \textit{static class initializer} and this would lead to problems.
1219 The CACAO initializer needs to tag the class or interface as currently
1220 initializing. This is done by setting the \texttt{initializing} field
1221 of the \texttt{classinfo} structure to \texttt{true}. CACAO needs this
1222 field in addition to the \texttt{initialized} field for two reasons:
1225 \item Another concurrently running thread can access a
1226 \texttt{static} field of the currently initializing class or
1227 interface. If the class or interface of the \texttt{static} field was
1228 not initialized during code generation, some special code was
1229 generated for the \texttt{ICMD\_PUTSTATIC} and
1230 \texttt{ICMD\_GETSTATIC} intermediate commands. This special code is
1231 a call to an architecture dependent assembler function named
1232 \texttt{asm\_check\_clinit}. Since this function is speed optimized
1233 for the case that the target class is already initialized, it only
1234 checks the \texttt{initialized} field and does not take care of
1235 any monitor that may have been entered. If the \texttt{initialized}
1236 flag is \texttt{false}, the assembler function calls the
1237 \texttt{class\_init} function where it probably stops at the monitor
1238 enter. Due to this fact, the thread which does the initialization
1239 cannot set the \texttt{initialized} flag to \texttt{true} when the
1240 initialization starts, otherwise potential concurrently running
1241 threads would continue their execution although the \textit{static
1242 class initializer} has not finished yet.
1244 \item The thread which is currently \texttt{initializing} the class
1245 or interface can pass the monitor which has been entered and thus
1246 needs to know if this class or interface is currently being initialized.
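
The interplay of the two flags can be sketched as follows. This is an
illustration only, with hypothetical names
(\texttt{init\_class\_sketch}, \texttt{init\_intern\_sketch},
\texttt{init\_sketch}). The monitor is replaced by a nesting counter, and
the \textit{static class initializer} is modelled by a function that may
touch another class, possibly the class itself, which triggers a nested
initialization request on the same thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the classinfo structure. */
typedef struct init_class_sketch {
    bool initialized;
    bool initializing;
    int  monitor_depth;                 /* stand-in for the monitor */
    int  clinit_runs;                   /* how often the initializer ran */
    bool clinit_fails;                  /* test hook: initializer throws */
    struct init_class_sketch *touches;  /* class used by the initializer */
} init_class_sketch;

init_class_sketch *init_sketch(init_class_sketch *c);

/* Stand-in for class_init_intern: runs the modelled initializer. */
static init_class_sketch *init_intern_sketch(init_class_sketch *c)
{
    c->clinit_runs++;
    if (c->touches != NULL)
        init_sketch(c->touches);        /* nested request, same thread */
    return c->clinit_fails ? NULL : c;
}

/* Stand-in for the class_init wrapper. */
init_class_sketch *init_sketch(init_class_sketch *c)
{
    init_class_sketch *r;

    assert(c != NULL);
    c->monitor_depth++;                 /* owner may enter again */

    if (c->initialized || c->initializing) {
        c->monitor_depth--;
        return c;                       /* done, or in progress */
    }

    c->initializing = true;
    r = init_intern_sketch(c);
    if (r != NULL)
        c->initialized = true;          /* set only after success */
    c->initializing = false;

    c->monitor_depth--;
    return r;
}
```

Because the nested request sees \texttt{initializing} set, the initializer
runs only once, even if it accesses a \texttt{static} field of its own
class.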
1249 First, \texttt{class\_init\_intern} checks if the passed class or
1250 interface is loaded and linked. If not, the particular action is
1251 taken. This is just a safety measure, because---CACAO
1252 internally---each class or interface should have been already loaded
1253 and linked before \texttt{class\_init} is called.
1255 Then the superclass, if one is specified, is checked to see whether it is already
1256 initialized. If not, the initialization is done immediately. The same
1257 check is performed for each interface in the \texttt{interfaces} array
1258 of the \texttt{classinfo} structure of the current class or interface.
1260 After the superclass and all interfaces are initialized, CACAO tries
1261 to find the \textit{static class initializer} function, where the
1262 method name matches \texttt{<clinit>} and the method descriptor
1263 \texttt{()V}. If no \textit{static class initializer} method is found in the
1264 current class or interface, the \texttt{class\_init\_intern} function
1265 returns immediately without an error. If a \textit{static class
1266 initializer} method is found, it is called with the architecture
1267 dependent assembler function \texttt{asm\_calljavafunction}.
1269 The handling of an exception thrown in a \textit{static class
1270 initializer} is a bit different than usual. It depends on the type of
1271 exception. If the exception thrown is an instance of
1272 \texttt{java.lang.Error}, the \texttt{class\_init\_intern} function
1273 just returns \texttt{NULL}. If the exception thrown is an instance of
1274 \texttt{java.lang.Exception}, the exception is wrapped into a
1275 \texttt{java.lang.ExceptionInInitializerError}. This is done via the
1276 \texttt{new\_exception\_throwable} function call. The newly generated
1277 error is set as exception thrown and the \texttt{class\_init\_intern}
1278 returns \texttt{NULL}.
1280 If no exception occurred in the \textit{static class initializer}, the
1281 internal initializing function returns the current \texttt{classinfo}
1282 structure pointer to indicate that the initialization was successful.