A \textit{Java Virtual Machine} (JVM) dynamically loads, links and
initializes classes and interfaces when they are needed. Loading a
class or interface means locating the binary representation (the
class files) and creating a class or interface structure from that
binary representation. Linking takes a loaded class or interface and
transfers it into the runtime state of the \textit{Java Virtual
Machine} so that it can be executed. Initialization of a class or
interface means executing the static class or interface initializer.
The following sections describe the process of loading, linking and
initializing a class or interface in the CACAO \textit{Java Virtual
Machine} in greater detail. Furthermore, the data structures and
techniques used in CACAO and the interaction with the GNU classpath
are described.
23 \section{System class loader}
24 \label{sectionsystemclassloader}
The class loader of a \textit{Java Virtual Machine} (JVM) is
responsible for loading all types of classes and interfaces into the
runtime system of the JVM. Every JVM has a \textit{system class
loader}, which is implemented in \texttt{java.lang.ClassLoader}; this
class interacts with the JVM itself via native function calls.
The \textit{GNU classpath} implements the system class loader in
\texttt{gnu.java.lang.SystemClassLoader}, which extends
\texttt{java.lang.ClassLoader} and interacts with the JVM. The
\textit{bootstrap class loader} is implemented in
\texttt{java.lang.ClassLoader} plus the JVM-dependent class
\texttt{java.lang.VMClassLoader}. \texttt{java.lang.VMClassLoader} is
the main class through which the bootstrap class loader of the GNU
classpath interacts with the JVM. The main function of this class is
46 static final native Class loadClass(String name, boolean resolve)
47 throws ClassNotFoundException;
52 This is a native function implemented in the CACAO JVM, which is
53 located in \texttt{nat/VMClassLoader.c} and calls the internal loader
54 functions of CACAO. If the \texttt{name} argument is \texttt{NULL}, a
55 new \texttt{java.lang.NullPointerException} is created and the
56 function returns \texttt{NULL}.
60 If the \texttt{name} is non-NULL a new UTF8 string of the class' name
61 is created in the internal \textit{symbol table} via
64 utf *javastring_toutf(java_lang_String *string, bool isclassname);
This function converts a \texttt{java.lang.String} string into the
internally used UTF8 string representation. \texttt{isclassname} tells
the function to convert any \texttt{.} (periods) found in the class
name into \texttt{/} (slashes), so the class loader can find the
class.
73 Then a new \texttt{classinfo} structure is created via the
76 classinfo *class_new(utf *classname);
79 function call. This function creates a unique representation of this
80 class, identified by its name, in the JVM's internal \textit{class
81 hashtable}. The newly created \texttt{classinfo} structure (Figure
82 \ref{classinfostructure}) is initialized with correct values, like
83 \texttt{loaded = false;}, \texttt{linked = false;} and
84 \texttt{initialized = false;}. This guarantees a definite state of a
89 struct classinfo { /* class structure */
91 s4 flags; /* ACC flags */
92 utf *name; /* class name */
94 s4 cpcount; /* number of entries in constant pool */
95 u1 *cptags; /* constant pool tags */
96 voidptr *cpinfos; /* pointer to constant pool info structures */
98 classinfo *super; /* super class pointer */
100 s4 interfacescount; /* number of interfaces */
101 classinfo **interfaces; /* pointer to interfaces */
103 s4 fieldscount; /* number of fields */
104 fieldinfo *fields; /* field table */
106 s4 methodscount; /* number of methods */
107 methodinfo *methods; /* method table */
109 bool initialized; /* true, if class already initialized */
110 bool initializing; /* flag for the compiler */
111 bool loaded; /* true, if class already loaded */
112 bool linked; /* true, if class already linked */
113 s4 index; /* hierarchy depth (classes) or index */
115 s4 instancesize; /* size of an instance of this class */
116 #ifdef SIZE_FROM_CLASSINFO
117 s4 alignedsize; /* size of an instance, aligned to the */
118 /* allocation size on the heap */
121 vftbl_t *vftbl; /* pointer to virtual function table */
123 methodinfo *finalizer; /* finalizer method */
125 u2 innerclasscount; /* number of inner classes */
126 innerclassinfo *innerclass;
128 utf *packagename; /* full name of the package */
129 utf *sourcefile; /* classfile name containing this class */
130 java_objectheader *classloader; /* NULL for bootstrap classloader */
133 \caption{\texttt{classinfo} structure}
134 \label{classinfostructure}
137 The next step is to actually load the class requested. Thus the main
141 classinfo *class_load(classinfo *c);
144 is called, which is a wrapper function to the real loader function
147 classinfo *class_load_intern(classbuffer *cb);
This wrapper function is needed to take care of the following tasks:
153 \item enter a monitor on the \texttt{classinfo} structure, so that
154 only one thread can load the same class at the same time
156 \item measure the loading time if requested
158 \item initialize the \texttt{classbuffer} structure with the actual
\item reset the \texttt{loaded} field of the \texttt{classinfo}
structure to \texttt{false} and remove the \texttt{classinfo}
structure from the internal class hashtable if an error or
exception occurred during loading
166 \item free any allocated memory and leave the monitor
The \texttt{class\_load} function is implemented to be
\textit{reentrant}. This is required by the \textit{eager class
loading} algorithm implemented in CACAO (described in more detail in
section \ref{sectioneagerclassloading}). Furthermore, this means that
several threads can load different classes or interfaces at the same
time on multiprocessor machines.
The \texttt{class\_load\_intern} function performs the actual loading
of the binary representation of the class or interface. During loading
some verifier checks are performed which can throw an error. This
error can be a \texttt{java.lang.ClassFormatError} or a
\texttt{java.lang.NoClassDefFoundError}. Some of these
\texttt{java.lang.ClassFormatError} checks are:
184 \item \textit{Truncated class file} --- unexpected end of class file
187 \item \textit{Bad magic number} --- class file does not start with
188 the magic bytes (\texttt{0xCAFEBABE})
190 \item \textit{Unsupported major.minor version} --- the bytecode
191 version of the given class file is not supported by the JVM
194 The actual loading of the bytes from the binary representation is done
195 via the \texttt{suck\_*} functions. These functions are
198 \item \texttt{suck\_u1}: load one \texttt{unsigned byte} (8 bit)
200 \item \texttt{suck\_u2}: load two \texttt{unsigned byte}s (16 bit)
202 \item \texttt{suck\_u4}: load four \texttt{unsigned byte}s (32 bit)
204 \item \texttt{suck\_u8}: load eight \texttt{unsigned byte}s (64 bit)
206 \item \texttt{suck\_float}: load four \texttt{byte}s (32 bit)
207 converted into a \texttt{float} value
209 \item \texttt{suck\_double}: load eight \texttt{byte}s (64 bit)
210 converted into a \texttt{double} value
212 \item \texttt{suck\_nbytes}: load \textit{n} bytes
215 Loading \texttt{signed} values is done via the
216 \texttt{suck\_s[1,2,4,8]} macros which cast the loaded bytes to
217 \texttt{signed} values. All these functions take a
218 \texttt{classbuffer} (Figure \ref{classbufferstructure}) structure
223 typedef struct classbuffer {
224 classinfo *class; /* pointer to classinfo structure */
225 u1 *data; /* pointer to byte code */
226 s4 size; /* size of the byte code */
227 u1 *pos; /* current read position */
230 \caption{\texttt{classbuffer} structure}
231 \label{classbufferstructure}
234 This \texttt{classbuffer} structure is filled with data via the
237 classbuffer *suck_start(classinfo *c);
function. This function tries to locate the class, specified by the
\texttt{classinfo} structure, in the \texttt{CLASSPATH}. This can be
a plain class file in the filesystem or a file in a
\texttt{zip}/\texttt{jar} file. If the class file is found, the
\texttt{classbuffer} is filled with data collected from the class
file, including the class file size and the binary representation of
the class.
248 Before reading any byte of the binary representation with a
249 \texttt{suck\_*} function, the remaining bytes in the
250 \texttt{classbuffer} data array must be checked with the
253 static inline bool check_classbuffer_size(classbuffer *cb, s4 len);
function. If the number of remaining bytes is less than the number of
bytes to be read, specified by the \texttt{len} argument, a
\texttt{java.lang.ClassFormatError} with the detail message
\textit{Truncated class file} (as mentioned before) is thrown.
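The check-then-read discipline described above can be sketched as follows. This is a minimal illustration, not CACAO's code: the \texttt{cbs\_} names are hypothetical, the buffer type is a reduced stand-in for \texttt{classbuffer} without the \texttt{classinfo} pointer, and a boolean return value stands in for throwing the \texttt{java.lang.ClassFormatError}. The only outside fact used is that multi-byte values in a class file are stored in big-endian order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for the classbuffer structure (hypothetical). */
typedef struct {
    const uint8_t *data;  /* start of the class file bytes */
    int32_t        size;  /* total number of bytes         */
    const uint8_t *pos;   /* current read position         */
} cbs_buffer;

/* True if at least len bytes are left between pos and the end. */
bool cbs_check_size(const cbs_buffer *cb, int32_t len)
{
    int32_t used = (int32_t) (cb->pos - cb->data);

    return len >= 0 && len <= cb->size - used;
}

/* Read one unsigned byte; the caller must have checked the size. */
uint8_t cbs_suck_u1(cbs_buffer *cb)
{
    assert(cbs_check_size(cb, 1));
    return *cb->pos++;
}

/* Read two bytes, most significant byte first. */
uint16_t cbs_suck_u2(cbs_buffer *cb)
{
    uint16_t hi = cbs_suck_u1(cb);
    uint16_t lo = cbs_suck_u1(cb);

    return (uint16_t) ((hi << 8) | lo);
}

/* Read four bytes, most significant byte first. */
uint32_t cbs_suck_u4(cbs_buffer *cb)
{
    uint32_t hi = cbs_suck_u2(cb);
    uint32_t lo = cbs_suck_u2(cb);

    return (hi << 16) | lo;
}

/* Read the magic number; false models "Truncated class file". */
bool cbs_read_magic(cbs_buffer *cb, uint32_t *magic)
{
    if (!cbs_check_size(cb, 4))
        return false;

    *magic = cbs_suck_u4(cb);
    return true;
}
```

A caller would compare the value stored by \texttt{cbs\_read\_magic} with \texttt{0xCAFEBABE} and report \textit{Bad magic number} on a mismatch.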
The following subsections describe, in chronological order and in
greater detail, the individual steps of loading a class or interface
from its binary representation.
266 \subsection{Constant pool loading}
267 \label{sectionconstantpoolloading}
269 The class' constant pool is loaded via
272 static bool class_loadcpool(classbuffer *cb, classinfo *c);
from the \texttt{constant\_pool} table in the binary representation of
the class or interface. The constant pool needs to be parsed in two
passes. In the first pass the information loaded is saved in temporary
structures, which are further processed in the second pass, when the
complete constant pool has been traversed. Only after all constant
pool entries have been loaded can any constant pool entry be
completely resolved, and this resolving can only be done in a specific
order:
285 \item \texttt{CONSTANT\_Class}
287 \item \texttt{CONSTANT\_String}
289 \item \texttt{CONSTANT\_NameAndType}
291 \item \texttt{CONSTANT\_Fieldref}, \texttt{CONSTANT\_Methodref} and
292 \texttt{CONSTANT\_InterfaceMethodref} --- these are combined into one
298 The remaining constant pool types \texttt{CONSTANT\_Integer},
299 \texttt{CONSTANT\_Float}, \texttt{CONSTANT\_Long},
300 \texttt{CONSTANT\_Double} and \texttt{CONSTANT\_Utf8} can be
301 completely resolved in the first pass and need no further processing.
305 The temporary structures, shown in Figure
306 \ref{constantpoolstructures}, are used to \textit{forward} the data
307 from the first pass into the second.
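The two-pass idea can be illustrated with a small sketch that handles only \texttt{CONSTANT\_Utf8} and \texttt{CONSTANT\_Class} entries. Everything prefixed with \texttt{fw\_} is hypothetical, including the field names \texttt{this\_index} and \texttt{name\_index}; the raw entry array stands in for the bytes of the class file, and the resolved class entry simply stores the name string where CACAO would store a \texttt{classinfo} pointer. The tag values 1 and 7 and the fact that constant pool indices start at 1 come from the class file format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { FW_UTF8 = 1, FW_CLASS = 7 };  /* class file tag values */

/* Raw entry as read from the class file (hypothetical layout). */
typedef struct {
    int         tag;
    const char *utf8;        /* valid for FW_UTF8  */
    int         name_index;  /* valid for FW_CLASS */
} fw_raw_entry;

/* Forward record: remembers a Class entry until pass 2. */
typedef struct fw_forward_class {
    struct fw_forward_class *next;
    int this_index;
    int name_index;
} fw_forward_class;

/* Fills cpinfos[1..count-1]. Returns 0 on success, -1 on a bad
   reference or an allocation failure. */
int fw_load_cpool(const fw_raw_entry *raw, int count, const char **cpinfos)
{
    fw_forward_class *forwards = NULL;
    int result = 0;

    /* Pass 1: Utf8 entries are final, Class entries are queued. */
    for (int i = 1; i < count; i++) {
        cpinfos[i] = NULL;

        if (raw[i].tag == FW_UTF8) {
            cpinfos[i] = raw[i].utf8;
        } else if (raw[i].tag == FW_CLASS) {
            fw_forward_class *f = malloc(sizeof *f);

            if (f == NULL) {
                result = -1;
                break;
            }
            f->this_index = i;
            f->name_index = raw[i].name_index;
            f->next = forwards;
            forwards = f;
        }
    }

    /* Pass 2: all Utf8 entries are known, so names can be looked up. */
    while (forwards != NULL) {
        fw_forward_class *f = forwards;
        int n = f->name_index;

        forwards = f->next;

        if (result == 0 && n >= 1 && n < count && raw[n].tag == FW_UTF8)
            cpinfos[f->this_index] = cpinfos[n];
        else
            result = -1;

        free(f);
    }

    return result;
}
```

The sketch shows why a single pass is not enough: a class entry may refer to a name entry that appears later in the pool.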
311 /* CONSTANT_Class entries */
312 typedef struct forward_class {
313 struct forward_class *next;
318 /* CONSTANT_String */
319 typedef struct forward_string {
320 struct forward_string *next;
325 /* CONSTANT_NameAndType */
326 typedef struct forward_nameandtype {
327 struct forward_nameandtype *next;
331 } forward_nameandtype;
333 /* CONSTANT_Fieldref, CONSTANT_Methodref or CONSTANT_InterfaceMethodref */
334 typedef struct forward_fieldmethint {
335 struct forward_fieldmethint *next;
339 u2 nameandtype_index;
340 } forward_fieldmethint;
342 \caption{temporary constant pool structures}
343 \label{constantpoolstructures}
346 The \texttt{classinfo} structure has two pointers to arrays which
347 contain the class' constant pool infos, namely: \texttt{cptags} and
348 \texttt{cpinfos}. \texttt{cptags} contains the type of the constant
349 pool entry. \texttt{cpinfos} contains a pointer to the constant pool
350 entry itself. In the second pass the references are resolved and the
351 runtime structures are created. In further detail this includes for
354 \item \texttt{CONSTANT\_Class}: get the UTF8 name string of the
355 class, store type \texttt{CONSTANT\_Class} in \texttt{cptags}, create
356 a class in the class hashtable with the UTF8 name and store the
357 pointer to the new class in \texttt{cpinfos}
359 \item \texttt{CONSTANT\_String}: get the UTF8 string of the
360 referenced string, store type \texttt{CONSTANT\_String} in
361 \texttt{cptags} and store the UTF8 string pointer into
366 \item \texttt{CONSTANT\_NameAndType}: create a
367 \texttt{constant\_nameandtype} (Figure \ref{constantnameandtype})
368 structure, get the UTF8 name and description string of the field or
369 method and store them into the \texttt{constant\_nameandtype}
370 structure, store type \texttt{CONSTANT\_NameAndType} into
371 \texttt{cptags} and store a pointer to the
372 \texttt{constant\_nameandtype} structure into \texttt{cpinfos}
378 typedef struct { /* NameAndType (Field or Method) */
379 utf *name; /* field/method name */
380 utf *descriptor; /* field/method type descriptor string */
381 } constant_nameandtype;
383 \caption{\texttt{constant\_nameandtype} structure}
384 \label{constantnameandtype}
389 \item \texttt{CONSTANT\_Fieldref}, \texttt{CONSTANT\_Methodref} and
390 \texttt{CONSTANT\_InterfaceMethodref}: create a
391 \texttt{constant\_FMIref} (Figure \ref{constantFMIref}) structure,
392 get the referenced \texttt{constant\_nameandtype} structure which
393 contains the name and descriptor resolved in a previous step and
394 store the name and descriptor into the \texttt{constant\_FMIref}
395 structure, get the pointer of the referenced class, which was created
396 in a previous step, and store the pointer of the class into the
397 \texttt{constant\_FMIref} structure, store the type of the current
398 constant pool entry in \texttt{cptags} and store a pointer to
399 \texttt{constant\_FMIref} in \texttt{cpinfos}
405 typedef struct { /* Fieldref, Methodref and InterfaceMethodref */
406 classinfo *class; /* class containing this field/method/interface */
407 utf *name; /* field/method/interface name */
408 utf *descriptor; /* field/method/interface type descriptor string */
411 \caption{\texttt{constant\_FMIref} structure}
412 \label{constantFMIref}
417 Any UTF8 strings, \texttt{constant\_nameandtype} structures or
418 referenced classes are resolved with the
421 voidptr class_getconstant(classinfo *c, u4 pos, u4 ctype);
function. This function checks for type equality and then returns the
requested \texttt{cpinfos} slot of the specified class.
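A minimal sketch of such a type-checked lookup is shown below. The \texttt{gc\_} names are hypothetical and the structure contains only the three constant pool fields of \texttt{classinfo}. Returning \texttt{NULL} on a mismatch or an out-of-range index is an assumption made for this sketch; the text above does not specify how the real function reports the error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the constant pool part of a class (hypothetical type). */
typedef struct {
    int32_t   cpcount;  /* number of entries in the constant pool */
    uint8_t  *cptags;   /* tag of each entry                      */
    void    **cpinfos;  /* resolved data of each entry            */
} gc_class;

/* Return slot pos if it exists and has the expected tag, else NULL. */
void *gc_getconstant(const gc_class *c, uint32_t pos, uint32_t ctype)
{
    if (c->cpcount <= 0 || pos >= (uint32_t) c->cpcount)
        return NULL;

    if (c->cptags[pos] != ctype)
        return NULL;

    return c->cpinfos[pos];
}
```

Checking the tag before handing out the pointer keeps a malformed class file from making the loader interpret, say, a name-and-type structure as a UTF8 string.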
428 \subsection{Interface loading}
Interface loading is simple and straightforward. After reading
the number of interfaces, a \texttt{u2} constant pool index is read
from the currently loading class or interface for every interface
referenced. This index is used to resolve the interface class
via the \texttt{class\_getconstant} function from the class' constant
pool. This means that interface \textit{loading} is really interface
\textit{resolving} rather than loading. The resolved interfaces are
stored in a \texttt{classinfo *} array allocated by the class loader.
The memory pointer of the array is assigned to the \texttt{interfaces}
field of the \texttt{classinfo} structure.
442 \subsection{Field loading}
The number of fields of the class or interface is read as a \texttt{u2}
value. For each field the function
448 static bool field_load(classbuffer *cb, classinfo *c, fieldinfo *f);
is called. The \texttt{fieldinfo *} argument is a pointer to a
\texttt{fieldinfo} structure (Figure \ref{fieldinfostructure})
allocated by the class loader. The field's \texttt{name} and
\texttt{descriptor} are resolved from the class constant pool via
\texttt{class\_getconstant}. If the verifier option is turned on, the
field's \texttt{flags}, \texttt{name} and \texttt{descriptor} are
checked for validity, which can result in a
\texttt{java.lang.ClassFormatError}.
462 struct fieldinfo { /* field of a class */
463 s4 flags; /* ACC flags */
464 s4 type; /* basic data type */
465 utf *name; /* name of field */
466 utf *descriptor; /* JavaVM descriptor string of field */
468 s4 offset; /* offset from start of object (instance variables) */
470 imm_union value; /* storage for static values (class variables) */
472 classinfo *class; /* needed by typechecker. Could be optimized */
473 /* away by using constant_FMIref instead of */
474 /* fieldinfo throughout the compiler. */
478 \caption{\texttt{fieldinfo} structure}
479 \label{fieldinfostructure}
Each field can have some attributes. The number of attributes is read
as a \texttt{u2} value from the binary representation. If the field has
the \texttt{ACC\_FINAL} bit set in the flags, the
\texttt{ConstantValue} attribute is available. This is the only
attribute processed by \texttt{field\_load} and it may occur only once,
otherwise a \texttt{java.lang.ClassFormatError} is thrown. The
\texttt{ConstantValue} entry in the constant pool contains the value
for the \texttt{final} field. Depending on the field's type, the
proper constant pool entry is resolved and assigned.
493 \subsection{Method loading}
As for the fields, the number of the class or interface methods is read from
the binary representation as a \texttt{u2} value. For each method the function
499 static bool method_load(classbuffer *cb, classinfo *c, methodinfo *m);
is called. The beginning of the method loading code is nearly the same
as the field loading code. The \texttt{methodinfo *} argument is a
pointer to a \texttt{methodinfo} structure allocated by the class
loader. The method's \texttt{name} and \texttt{descriptor} are
resolved from the class constant pool via
\texttt{class\_getconstant}. With the verifier turned on, some method
checks are carried out. These include \texttt{flags}, \texttt{name}
and \texttt{descriptor} checks and an argument count check.
513 struct methodinfo { /* method structure */
514 java_objectheader header; /* we need this in jit's monitorenter */
515 s4 flags; /* ACC flags */
516 utf *name; /* name of method */
517 utf *descriptor; /* JavaVM descriptor string of method */
519 bool isleafmethod; /* does method call subroutines */
521 classinfo *class; /* class, the method belongs to */
522 s4 vftblindex; /* index of method in virtual function */
523 /* table (if it is a virtual method) */
524 s4 maxstack; /* maximum stack depth of method */
525 s4 maxlocals; /* maximum number of local variables */
526 s4 jcodelength; /* length of JavaVM code */
527 u1 *jcode; /* pointer to JavaVM code */
529 s4 exceptiontablelength;/* exceptiontable length */
530 exceptiontable *exceptiontable; /* the exceptiontable */
532 u2 thrownexceptionscount;/* number of exceptions attribute */
533 classinfo **thrownexceptions; /* checked exceptions a method may throw */
535 u2 linenumbercount; /* number of linenumber attributes */
536 lineinfo *linenumbers; /* array of lineinfo items */
538 u1 *stubroutine; /* stub for compiling or calling natives */
542 \caption{\texttt{methodinfo} structure}
543 \label{methodinfostructure}
The method loading function has to distinguish between a
\texttt{native} and a ``normal'' JAVA method. Depending on the
\texttt{ACC\_NATIVE} flag, a different stub is created.
For a JAVA method, a \textit{compiler stub} is created. The purpose of
this stub is to call the CACAO JIT compiler with a pointer to the byte
code of the JAVA method as argument to compile the method into machine
code. During code generation a pointer to this compiler stub routine
is used as a temporary call target if the method is not compiled
yet. After the target method is compiled, the new entry point of the
method is patched into the generated code. The compiler stub is then
no longer needed and is freed.
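The compile-on-first-call idea behind the compiler stub can be sketched in portable C with a function pointer. This is only an analogy under stated assumptions: all \texttt{st\_} names are hypothetical, the ``compiled code'' is an ordinary C function, and the entry point is patched in a per-method structure, whereas CACAO patches the generated machine code and frees the stub afterwards.

```c
#include <assert.h>
#include <stddef.h>

typedef struct st_method st_method;
typedef int (*st_entry)(st_method *m, int arg);

struct st_method {
    st_entry entrypoint;     /* where callers jump to               */
    int      compile_count;  /* how often the "compiler" was called */
};

/* Stands in for the machine code the JIT compiler would emit. */
int st_compiled_body(st_method *m, int arg)
{
    (void) m;
    return arg * 2;
}

/* Stands in for the compiler stub: compile the method, patch the
   entry point, then continue in the compiled code. */
int st_compiler_stub(st_method *m, int arg)
{
    m->compile_count++;
    m->entrypoint = st_compiled_body;

    return m->entrypoint(m, arg);
}

/* A freshly loaded method initially points at the stub. */
void st_method_init(st_method *m)
{
    m->entrypoint = st_compiler_stub;
    m->compile_count = 0;
}
```

Only the first call through \texttt{entrypoint} pays for compilation; every later call goes directly to the compiled body.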
If the method is a \texttt{native} method, the loader tries to find
the native function. If the function was found, a \textit{native stub}
is generated. This stub is responsible for adapting the method's
arguments so that they are suitable for the \texttt{native} method
called. This includes inserting the \textit{JNI environment} pointer
as first argument and, if the \texttt{native} method has the
\texttt{ACC\_STATIC} flag set, inserting a pointer to the method's
class as second argument. If the \texttt{native} method is
\texttt{static}, the native stub also checks if the method's class is
already initialized. If the method's class is not initialized when the
native stub is generated, an \texttt{asm\_check\_clinit} calling code
572 Each method can have some attributes. The method loading function
573 processes two of them: \texttt{Code} and \texttt{Exceptions}.
575 The \texttt{Code} attribute is a \textit{variable-length} attribute
576 which contains the Java Virtual Machine instructions---the byte
577 code---of the JAVA method. If the method is either \texttt{native} or
578 \texttt{abstract}, it must not have a \texttt{Code} attribute,
579 otherwise it must have exactly one \texttt{Code}
attribute. In addition to the byte code, the \texttt{Code} attribute
contains the exception table and attributes of the \texttt{Code} attribute
582 itself. One exception table entry contains the \texttt{start\_pc},
584 \texttt{handler\_pc} of the \texttt{try-catch} block, each read as
585 \texttt{u2} value, plus a reference to the class of the
586 \texttt{catch\_type}. Currently there are two attributes of the
587 \texttt{Code} attribute defined in the JVM specification:
588 \texttt{LineNumberTable} and \texttt{LocalVariableTable}. CACAO only
processes the \texttt{LineNumberTable} attribute. A
\texttt{LineNumberTable} entry consists of the \texttt{start\_pc} and
the \texttt{line\_number}, which are stored in a \texttt{lineinfo}
structure (Figure \ref{lineinfostructure}).
601 \caption{\texttt{lineinfo} structure}
602 \label{lineinfostructure}
The line number count and the memory pointer of the \texttt{lineinfo}
structure array are assigned to the \texttt{methodinfo} fields
\texttt{linenumbercount} and \texttt{linenumbers} respectively.
The \texttt{Exceptions} attribute is a \textit{variable-length}
attribute and contains the checked exceptions the JAVA method may
throw. The \texttt{Exceptions} attribute consists of the count of
exceptions, which is stored in the \texttt{methodinfo} field
\texttt{thrownexceptionscount}, and the corresponding number of
\texttt{u2} constant pool index values. The exception classes are
resolved from the constant pool and stored in an allocated
\texttt{classinfo *} array, whose memory pointer is assigned to the
\texttt{thrownexceptions} field of the \texttt{methodinfo} structure.
Any attributes which are not processed by the CACAO class loading
system are skipped via
623 static bool skipattributebody(classbuffer *cb);
626 which skips one attribute or
629 static bool skipattributes(classbuffer *cb, u4 num);
632 which skips a specified number \texttt{num} of attributes. If any
633 problem occurs in the method loading function, a
634 \texttt{java.lang.ClassFormatError} with a specific detail message is
638 \subsection{Attribute loading}
640 Attribute loading is done via the
643 static bool attribute_load(classbuffer *cb, classinfo *c, u4 num);
646 function. The currently loading class or interface can contain some
647 additional attributes which have not already been loaded. The CACAO
648 system class loader processes two of them: \texttt{InnerClasses} and
The \texttt{InnerClasses} attribute is a \textit{variable-length}
attribute in the \texttt{attributes} table of the binary
representation of the class or interface. An \texttt{InnerClasses} entry
contains the \texttt{inner\_class} constant pool index itself, the
\texttt{outer\_class} index, the \texttt{name} index of the inner
class' name and the inner class' \texttt{flags} bitmask. All these
values are read in \texttt{u2} chunks.
659 The constant pool indexes are used with the
662 voidptr innerclass_getconstant(classinfo *c, u4 pos, u4 ctype);
665 function call to resolve the classes or UTF8 strings. After resolving
666 is done, all values are stored in the \texttt{innerclassinfo}
667 structure (Figure \ref{innerclassinfostructure}).
671 struct innerclassinfo {
672 classinfo *inner_class; /* inner class pointer */
673 classinfo *outer_class; /* outer class pointer */
674 utf *name; /* innerclass name */
675 s4 flags; /* ACC flags */
678 \caption{\texttt{innerclassinfo} structure}
679 \label{innerclassinfostructure}
682 The other attribute, \texttt{SourceFile}, is just one \texttt{u2}
683 constant pool index value to get the UTF8 string reference of the
684 class' \texttt{SourceFile} name. The string pointer is assigned to the
685 \texttt{sourcefile} field of the \texttt{classinfo} structure.
Each of these two attributes may occur only once. Attributes other
than these two are skipped with the previously mentioned
\texttt{skipattributebody}
After the attribute loading is done and no error occurred, the
\texttt{class\_load\_intern} function returns the \texttt{classinfo}
pointer to signal that there was no problem. If \texttt{NULL} is
returned, there was an exception.
697 \section{Dynamic class loader}
700 \section{Eager - lazy class loading}
702 A Java Virtual Machine can implement two different algorithms for the
703 system class loader to load classes or interfaces: \textit{eager class
704 loading} and \textit{lazy class loading}.
707 \subsection{Eager class loading}
708 \label{sectioneagerclassloading}
The Java Virtual Machine initially creates, loads and links the class
of the main program with the system class loader. The creation of the
class is done via the \texttt{class\_new} function call (see section
\ref{sectionsystemclassloader}). In this function, with \textit{eager
loading} enabled, the currently created class or interface is first
loaded with \texttt{class\_load}. CACAO uses the \textit{eager class
loading} algorithm when started with the command line switch
\texttt{-eager}. As described in the ``Constant pool loading'' section
(see \ref{sectionconstantpoolloading}), the binary representation of a
class or interface contains references to other classes or
interfaces. With \textit{eager loading} enabled, referenced classes or
interfaces are loaded immediately.
723 If a class reference is found in the second pass of the constant pool
724 loading process, the class is created in the class hashtable with
725 \texttt{class\_new\_intern}. CACAO uses the intern function here
726 because the normal \texttt{class\_new} function, which is a wrapper
727 function, instantly tries to \textit{link} all referenced
728 classes. This must not happen until all classes or interfaces
729 referenced are loaded, otherwise the Java Virtual Machine gets into an
732 After the \texttt{classinfo} of the class referenced is created, the
733 class or interface is \textit{loaded} via the \texttt{class\_load}
734 function (described in more detail in section
735 \ref{sectionsystemclassloader}). When the class loading function
736 returns, the current referenced class or interface is added to a list
737 called \texttt{unlinkedclasses}, which contains all loaded but
738 unlinked classes referenced by the currently loaded class or
739 interface. This list is processed in the \texttt{class\_new} function
740 of the currently created class or interface after \texttt{class\_load}
741 returns. For each entry in the \texttt{unlinkedclasses} list,
742 \texttt{class\_link} is called which finally \textit{links} the class
743 (described in more detail in section \ref{sectionlinking}) and then
the class entry is removed from the list. When all referenced classes
or interfaces are linked, the currently created class or interface is
linked and the \texttt{class\_new} function returns.
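The load-everything-first, link-afterwards order described above can be sketched as follows. The \texttt{eg\_} names, the fixed-size reference array and the counting return value are hypothetical, and linking is reduced to setting a flag. Setting \texttt{loaded} before following the references is an assumption of this sketch that makes cyclic references terminate; the list is passed as a parameter here instead of being a global \texttt{unlinkedclasses} list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EG_MAX_REFS 4

typedef struct eg_class {
    struct eg_class *refs[EG_MAX_REFS];  /* classes referenced in the pool */
    int              nrefs;
    bool             loaded;
    bool             linked;
    struct eg_class *next_unlinked;      /* link in the unlinked list      */
} eg_class;

/* Load c and, eagerly, every class it references. Each referenced
   class is put on the unlinked list after its own load returns. */
void eg_load(eg_class *c, eg_class **unlinked)
{
    if (c->loaded)
        return;

    c->loaded = true;  /* set first so that cyclic references terminate */

    for (int i = 0; i < c->nrefs; i++) {
        eg_class *r = c->refs[i];

        if (!r->loaded) {
            eg_load(r, unlinked);
            r->next_unlinked = *unlinked;
            *unlinked = r;
        }
    }
}

/* Models the wrapper: load c, link everything on the list, then
   link c itself. Returns the number of classes linked by this call. */
int eg_new(eg_class *c)
{
    eg_class *unlinked = NULL;
    int count = 0;

    eg_load(c, &unlinked);

    while (unlinked != NULL) {
        eg_class *r = unlinked;

        unlinked = r->next_unlinked;  /* remove the entry from the list */
        if (!r->linked) {
            r->linked = true;
            count++;
        }
    }

    if (!c->linked) {
        c->linked = true;
        count++;
    }

    return count;
}
```

Because no class is linked before the whole reference graph has been loaded, a reference from a class back to the class that is currently being created does not cause linking of a half-loaded class.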
749 \subsection{Lazy class loading}
750 \label{sectionlazyclassloading}
With \textit{eager class loading}, it usually takes much more time for
a Java Virtual Machine to start a program than with \textit{lazy class
loading}. With \textit{eager class loading}, a typical
\texttt{HelloWorld} program needs 513 class loads with the current GNU
classpath CACAO is using. When using \textit{lazy class loading},
CACAO only needs 121 class loads for the same \texttt{HelloWorld}
program. This means that with \textit{lazy class loading} CACAO loads
less than a quarter of the class files. Furthermore, CACAO also
does \textit{lazy class linking}, which saves even more run time.
CACAO's \textit{lazy class loading} implementation does not completely
follow the JVM specification. A Java Virtual Machine which implements
\textit{lazy class loading} should load and link requested classes or
interfaces at runtime. CACAO, however, does class loading and linking
at parse time, because of some problems not resolved yet. That means,
if a Java Virtual Machine instruction is parsed which uses any class or
interface references, like \texttt{JAVA\_PUTSTATIC},
\texttt{JAVA\_GETFIELD} or any \texttt{JAVA\_INVOKE*} instruction,
the referenced class or interface is loaded and linked immediately
during the parse pass of the currently compiled method. This introduces
some incompatibilities with other Java Virtual Machines like Sun's
JVM, IBM's JVM or Kaffe.
775 Imagine a code snippet like this
778 void sub(boolean b) {
782 System.out.println("foobar");
If the function is called with \texttt{b} equal to \texttt{false} and
the class file \texttt{A.class} does not exist, a Java Virtual Machine
should execute the code without any problems, print \texttt{foobar}
and exit the Java Virtual Machine with exit code 0. Because
CACAO does class loading and linking at parse time, the CACAO
Virtual Machine throws a \texttt{java.lang.NoClassDefFoundError:~A}
exception which is not caught, so execution stops and the Virtual
Machine exits without printing \texttt{foobar}.
The CACAO development team does not yet have a solution for this
problem. It is not trivial to move the loading and linking process from
the compilation phase into runtime, especially because CACAO was
initially designed for \textit{eager class loading} and \textit{lazy
class loading} was implemented at a later time to optimize class
loading and to get a little closer to the JVM specification.
\textit{Lazy class loading} at runtime is one of the most important
features to be implemented in the future. It is essential to make
CACAO a standard-compliant Java Virtual Machine.
807 \label{sectionlinking}
809 Linking is the process of preparing a previously loaded class or
810 interface to be used in the Java Virtual Machine's runtime
811 environment. The function which performs the linking in CACAO is
814 classinfo *class_link(classinfo *c);
817 This function, as for class loading, is just a wrapper function for
818 the main linking function
821 static classinfo *class_link_intern(classinfo *c);
824 This function should not be called directly and is thus declared as
825 \texttt{static}. The purposes of the wrapper function are
\item enter a monitor on the \texttt{classinfo} structure, so that it
is guaranteed that only one thread can link the same class at the same
time
832 \item measure linking time if requested
834 \item check if the intern linking function has thrown an error or an
835 exception and reset the \texttt{linked} field of the
836 \texttt{classinfo} structure
838 \item leave the monitor
The \texttt{class\_link} function, like the \texttt{class\_load}
function, is implemented to be \textit{reentrant}. This is required
by the linking algorithm implemented in CACAO. Furthermore, this
means that several threads can link different classes or interfaces
at the same time on multiprocessor machines.
The first step in the \texttt{class\_link\_intern} function is to set
the \texttt{linked} field of the currently linked \texttt{classinfo}
structure to \texttt{true}. This is essential so that the linker does
not try to link a class or interface again while it is already in the
linking process. Such a case can occur because the linker also
processes the class' direct superclass and direct superinterfaces.
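The effect of setting the flag first can be seen in a small sketch. The \texttt{lk\_} names are hypothetical, only the superclass is followed, and the counter exists purely to make the behaviour observable; the early return on an already set flag is an assumption about how callers use the \texttt{linked} field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct lk_class {
    struct lk_class *super;      /* direct superclass, NULL for the root */
    bool             linked;
    int              link_runs;  /* how often the link body actually ran */
} lk_class;

void lk_link(lk_class *c)
{
    if (c->linked)
        return;         /* already linked, or linking is in progress */

    c->linked = true;   /* mark before touching any related class */
    c->link_runs++;

    if (c->super != NULL)
        lk_link(c->super);
}
```

Even for a malformed class that names itself as its superclass, the recursive call returns immediately because the flag is already set, so the link body runs once.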
In CACAO's linker the direct superinterfaces are processed first. For
each interface in the \texttt{interfaces} field of the
\texttt{classinfo} structure, the linker first checks for class
circularity: a \texttt{java.lang.ClassCircularityError} is thrown when
the currently linked class or interface is the same as the interface
to be processed. Otherwise the interface is loaded and linked if this
has not already been done. After the interface has been loaded
successfully, its flags are checked for the \texttt{ACC\_INTERFACE}
bit. If this bit is not set, a
\texttt{java.lang.IncompatibleClassChangeError} is thrown and
\texttt{class\_link\_intern} returns.

Then the direct superclass is handled. If the direct superclass is
\texttt{NULL}, we have the special case of linking
\texttt{java.lang.Object}. In this case only some \texttt{classinfo}
fields are set to special values for \texttt{java.lang.Object}, like

\begin{verbatim}
        c->instancesize = sizeof(java_objectheader);
\end{verbatim}

If the direct superclass is non-\texttt{NULL}, CACAO first checks for
class circularity, as for interfaces. If no
\texttt{java.lang.ClassCircularityError} was thrown, the superclass is
loaded and linked if this has not already been done. Then two flag
bits of the superclass are checked: \texttt{ACC\_INTERFACE} and
\texttt{ACC\_FINAL}. If one of these bits is set, an error is thrown.

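The checks on superinterfaces and on the direct superclass can be summarized in a small sketch. The reduced \texttt{classinfo}, the \texttt{link\_status} codes and the function name \texttt{check\_supertypes} are assumptions for illustration. The flag values are the ones defined for class files by the JVM specification. Loading and linking of the supertypes is omitted, and since the text does not name the error thrown for a bad superclass, the sketch uses one generic code for it.

```c
#include <assert.h>
#include <stddef.h>

#define ACC_FINAL     0x0010   /* class file access flags (JVM specification) */
#define ACC_INTERFACE 0x0200

typedef enum {
    LINK_OK,
    LINK_CIRCULARITY,          /* java.lang.ClassCircularityError */
    LINK_INCOMPATIBLE_CHANGE,  /* java.lang.IncompatibleClassChangeError */
    LINK_BAD_SUPERCLASS        /* superclass is an interface or final */
} link_status;

typedef struct classinfo classinfo;
struct classinfo {
    int         flags;
    classinfo  *super;            /* NULL only for java.lang.Object */
    int         interfacescount;
    classinfo **interfaces;
};

link_status check_supertypes(const classinfo *c)
{
    int i;

    assert(c != NULL);

    /* direct superinterfaces are processed first */
    for (i = 0; i < c->interfacescount; i++) {
        const classinfo *ic = c->interfaces[i];

        if (ic == c)
            return LINK_CIRCULARITY;
        /* the real linker would load and link ic at this point */
        if (!(ic->flags & ACC_INTERFACE))
            return LINK_INCOMPATIBLE_CHANGE;
    }

    if (c->super == NULL)         /* special case: java.lang.Object */
        return LINK_OK;
    if (c->super == c)
        return LINK_CIRCULARITY;
    if (c->super->flags & (ACC_INTERFACE | ACC_FINAL))
        return LINK_BAD_SUPERCLASS;

    return LINK_OK;
}
```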
If the currently linked class is an array, CACAO calls a special array
linking function

\begin{verbatim}
        static arraydescriptor *class_link_array(classinfo *c);
\end{verbatim}

This function first checks if the passed \texttt{classinfo} is an
\textit{array of arrays} or an \textit{array of objects}. In both
cases the component type is created in the class hashtable via
\texttt{class\_new} and then loaded and linked if this has not already
been done. If neither is the case, the passed array is a
\textit{primitive type array}. Whatever the type of the array, an
\texttt{arraydescriptor} structure (Figure
\ref{arraydescriptorstructure}) is allocated and filled with the
appropriate values of the given array type.

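The three array kinds can be told apart from the class name alone, since array class names in the JVM start with \texttt{[} followed by the component type descriptor. The following sketch assumes this naming convention. Whether \texttt{class\_link\_array} inspects the name in exactly this way is not stated here, and the names \texttt{array\_kind} and \texttt{classify\_array} are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    ARRAY_OF_ARRAYS,      /* e.g. "[[I" */
    ARRAY_OF_OBJECTS,     /* e.g. "[Ljava/lang/String;" */
    ARRAY_OF_PRIMITIVES   /* e.g. "[I" */
} array_kind;

/* Classify an array class by the character after the leading '['. */
array_kind classify_array(const char *name)
{
    assert(name != NULL && name[0] == '[');

    switch (name[1]) {
    case '[':
        return ARRAY_OF_ARRAYS;      /* component type is again an array */
    case 'L':
        return ARRAY_OF_OBJECTS;     /* component type is a class */
    default:
        return ARRAY_OF_PRIMITIVES;  /* component type is a primitive */
    }
}
```

In the first two cases the component type is itself a class that has to be loaded and linked, which is why only these two cases go through the class hashtable.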
\begin{figure}
\begin{verbatim}
struct arraydescriptor {
    vftbl_t *componentvftbl; /* vftbl of the component type, NULL for primit. */
    vftbl_t *elementvftbl;   /* vftbl of the element type, NULL for primitive */
    s2       arraytype;      /* ARRAYTYPE_* constant                          */
    s2       dimension;      /* dimension of the array (always >= 1)          */
    s4       dataoffset;     /* offset of the array data from object pointer  */
    s4       componentsize;  /* size of a component in bytes                  */
    s2       elementtype;    /* ARRAYTYPE_* constant                          */
};
\end{verbatim}
\caption{\texttt{arraydescriptor} structure}
\label{arraydescriptorstructure}
\end{figure}

After the \texttt{class\_link\_array} function call, the temporary
class \texttt{index} is calculated. For interfaces (classes with the
\texttt{ACC\_INTERFACE} flag bit set) the class' \texttt{index} is
the global \texttt{interfaceindex} plus one. All other classes get the
\texttt{index} of the superclass plus one.

Other \texttt{classinfo} fields are also taken from the superclass,
like \texttt{instancesize}, \texttt{vftbllength} and the
\texttt{finalizer} function. All these values are temporary ones and
can be overwritten later.

The next step in \texttt{class\_link\_intern} is to compute the
\textit{virtual function table length}. For each method in
\texttt{classinfo}'s \texttt{methods} field which does not have the
\texttt{ACC\_STATIC} flag bit set, and thus is an instance method, the
superclasses up to \texttt{java.lang.Object} are checked with

\begin{verbatim}
        static bool method_canoverwrite(methodinfo *m, methodinfo *old);
\end{verbatim}

to see whether the current method can overwrite a superclass method,
if one exists. If the found superclass method has the
\texttt{ACC\_PRIVATE} flag bit set, the current method gets a new
slot: its \textit{virtual function table index} is the current
\textit{virtual function table length}, which is then incremented by
one:

\begin{verbatim}
        m->vftblindex = (vftbllength++);
\end{verbatim}

If the found superclass method has the \texttt{ACC\_FINAL} flag bit
set, the CACAO class linker throws a \texttt{java.lang.VerifyError},
because a \texttt{final} method must not be overridden. Otherwise the
current method's \textit{virtual function table index} is the same as
the index of the superclass method:

\begin{verbatim}
        m->vftblindex = tc->methods[j].vftblindex;
\end{verbatim}

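The index assignment can be sketched as one loop over the methods of the class. The reduced structures and the names \texttt{can\_overwrite} and \texttt{assign\_vftbl\_indices} are assumptions for this example. The sketch further assumes that \texttt{method\_canoverwrite} compares name and descriptor, that a method without a matching superclass method also gets a new slot, and that the error case is an attempt to override a \texttt{final} superclass method (which follows Java's overriding rules).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ACC_PRIVATE 0x0002   /* method access flags (JVM specification) */
#define ACC_STATIC  0x0008
#define ACC_FINAL   0x0010

typedef struct methodinfo {
    const char *name;
    const char *descriptor;
    int         flags;
    int         vftblindex;
} methodinfo;

typedef struct classinfo classinfo;
struct classinfo {
    classinfo  *super;
    int         methodscount;
    methodinfo *methods;
    int         vftbllength;
};

/* Assumed meaning of method_canoverwrite: same name and descriptor. */
static bool can_overwrite(const methodinfo *m, const methodinfo *old)
{
    return strcmp(m->name, old->name) == 0 &&
           strcmp(m->descriptor, old->descriptor) == 0;
}

/* Returns the new table length, or -1 for the VerifyError case. */
int assign_vftbl_indices(classinfo *c)
{
    int vftbllength = c->super ? c->super->vftbllength : 0;
    int i, j;

    for (i = 0; i < c->methodscount; i++) {
        methodinfo       *m = &c->methods[i];
        const methodinfo *old = NULL;
        const classinfo  *tc;

        if (m->flags & ACC_STATIC)
            continue;                        /* static methods get no slot */

        for (tc = c->super; tc != NULL && old == NULL; tc = tc->super)
            for (j = 0; j < tc->methodscount; j++)
                if (can_overwrite(m, &tc->methods[j])) {
                    old = &tc->methods[j];
                    break;
                }

        if (old == NULL || (old->flags & ACC_PRIVATE))
            m->vftblindex = vftbllength++;   /* new slot at the end */
        else if (old->flags & ACC_FINAL)
            return -1;                       /* overriding a final method */
        else
            m->vftblindex = old->vftblindex; /* reuse the inherited slot */
    }

    c->vftbllength = vftbllength;
    return vftbllength;
}
```

Reusing the superclass slot is what makes a virtual call through the table reach the overriding method without any lookup at call time.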
After processing the \textit{virtual function table length}, the CACAO
linker computes the \textit{interface table length}. For the
interfaces of the current class and of every superclass, the function

\begin{verbatim}
        static s4 class_highestinterface(classinfo *c);
\end{verbatim}

is called. This function computes the highest interface \texttt{index}
reachable from the passed interface and returns the value. This is
done by recursively calling \texttt{class\_highestinterface} with each
interface of the passed interface. The highest \texttt{index} value
found is the \textit{interface table length} of the currently linked
class.

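The recursion in \texttt{class\_highestinterface} amounts to a maximum over the interface and all interfaces it extends. A minimal sketch, with a reduced \texttt{classinfo} and the hypothetical name \texttt{highest\_interface\_index}:

```c
#include <assert.h>
#include <stddef.h>

typedef struct classinfo classinfo;
struct classinfo {
    int         index;            /* interface index */
    int         interfacescount;
    classinfo **interfaces;       /* directly extended interfaces */
};

/* Highest index of the interface itself and all its superinterfaces. */
int highest_interface_index(const classinfo *c)
{
    int h, i;

    assert(c != NULL);
    h = c->index;

    for (i = 0; i < c->interfacescount; i++) {
        int h2 = highest_interface_index(c->interfaces[i]);

        if (h2 > h)
            h = h2;
    }

    return h;
}
```

The linker would take the maximum of this value over all interfaces of the class and its superclasses, so that every reachable interface index fits into the interface table.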
Now that the linker has completely computed the size of the
\textit{virtual function table}, the memory can be allocated, cast
to a \texttt{vftbl} structure (Figure \ref{vftblstructure}) and
filled with the previously calculated values.

\begin{figure}
\begin{verbatim}
struct vftbl {
    methodptr   *interfacetable[1];    /* interface table (access via macro)  */

    classinfo   *class;                /* class, the vtbl belongs to          */

    arraydescriptor *arraydesc;        /* for array classes, otherwise NULL   */

    s4           vftbllength;          /* virtual function table length       */
    s4           interfacetablelength; /* interface table length              */

    s4           baseval;              /* base for runtime type check         */
                                       /* (-index for interfaces)             */
    s4           diffval;              /* high - base for runtime type check  */

    s4          *interfacevftbllength; /* length of interface vftbls          */

    methodptr    table[1];             /* class vftbl                         */
};
\end{verbatim}
\caption{\texttt{vftbl} structure}
\label{vftblstructure}
\end{figure}

Some important values are

\begin{verbatim}
        c->header.vftbl = c->vftbl = v;
        v->vftbllength = vftbllength;
        v->interfacetablelength = interfacetablelength;
        v->arraydesc = arraydesc;
\end{verbatim}

If the currently linked class is an interface, the \texttt{baseval} of
the interface's \textit{virtual function table} is set to
\texttt{-(c->index)}. Then the \textit{virtual function table} of the
direct superclass is copied into the \texttt{table} field of the
current \textit{virtual function table}, and for each
non-\texttt{static} method in the \texttt{methods} field of the
current class or interface, the pointer to the \textit{stub routine}
of the method is stored in the \textit{virtual function table}.

Now the fields of the currently linked class or interface are
processed. The CACAO linker computes the instance size of the class or
interface and the offset of each field inside an instance. For each
non-\texttt{static} field in the \texttt{fields} field of the
\texttt{classinfo} structure, the type size is resolved via the
\texttt{desc\_typesize} function call. Then a new
\texttt{instancesize} is calculated with

\begin{verbatim}
        c->instancesize = ALIGN(c->instancesize, dsize);
\end{verbatim}

which aligns the size to a boundary suitable for the next field. This
newly computed \texttt{instancesize} is the \texttt{offset} of the
currently processed field. The type size is then added to get the real
\texttt{instancesize}.

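The field layout loop can be illustrated with a few lines of C. The macro \texttt{ALIGN\_UP} is an assumed definition that rounds up to the next multiple of the field size, and the reduced \texttt{fieldinfo} and the name \texttt{layout\_fields} are hypothetical. The starting size of 8 bytes used below is only an example value for the object header.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definition: round pos up to the next multiple of size. */
#define ALIGN_UP(pos, size) ((((pos) + (size) - 1) / (size)) * (size))

typedef struct fieldinfo {
    int size;        /* type size in bytes, as desc_typesize would return */
    int is_static;   /* non-zero for static fields */
    int offset;      /* computed offset inside the instance */
} fieldinfo;

/* Assign offsets to instance fields; returns the final instance size. */
int layout_fields(fieldinfo *fields, int count, int instancesize)
{
    int i;

    assert(fields != NULL || count == 0);

    for (i = 0; i < count; i++) {
        fieldinfo *f = &fields[i];

        if (f->is_static)
            continue;                       /* static fields are not in the instance */

        instancesize = ALIGN_UP(instancesize, f->size);
        f->offset    = instancesize;        /* aligned size is the field offset */
        instancesize += f->size;            /* then account for the field itself */
    }

    return instancesize;
}
```

For a header of 8 bytes followed by a \texttt{byte}, an \texttt{int} and a \texttt{long} field, this yields the offsets 8, 12 and 16 and an instance size of 24 bytes.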
The next step of the CACAO linker is to initialize the \textit{virtual
function table} fields \texttt{interfacevftbllength} and
\texttt{interfacetable}. For \texttt{interfacevftbllength} an
\texttt{s4} array of \texttt{interfacetablelength} elements is
allocated. Each \texttt{interfacevftbllength} element is initialized
with \texttt{0} and the elements in \texttt{interfacetable} with
\texttt{NULL}. After the initialization is done, the interfaces of the
currently linked class and all its superclasses, up to
\texttt{java.lang.Object}, are processed via the function

\begin{verbatim}
        static void class_addinterface(classinfo *c, classinfo *ic);
\end{verbatim}

This function adds the methods of the passed interface to the
\textit{virtual function table} of the passed class or interface. If
the method count of the passed interface is zero, the function adds a
fake method entry, which is needed for subtype checking:

\begin{verbatim}
        v->interfacevftbllength[i] = 1;
        v->interfacetable[-i] = MNEW(methodptr, 1);
        v->interfacetable[-i][0] = NULL;
\end{verbatim}

\texttt{i} represents the \texttt{index} of the passed interface
\texttt{ic}, \texttt{v} the \textit{virtual function table} of the
passed class or interface \texttt{c}.

If the method count is non-zero, a \texttt{methodptr} array of
\texttt{ic->methodscount} elements is allocated and the method count
value is stored in the corresponding position of the
\texttt{interfacevftbllength} array:

\begin{verbatim}
        v->interfacevftbllength[i] = ic->methodscount;
        v->interfacetable[-i] = MNEW(methodptr, ic->methodscount);
\end{verbatim}

For each method of the passed interface, the methods of the passed
target class or interface and of all its superclasses are checked via
\texttt{method\_canoverwrite} to see whether they can overwrite the
interface method. If the function returns \texttt{true}, the
corresponding function is resolved from the \texttt{table} field of
the \textit{virtual function table} and stored in the corresponding
position of the \texttt{interfacetable}:

\begin{verbatim}
        v->interfacetable[-i][j] = v->table[mi->vftblindex];
\end{verbatim}

The \texttt{class\_addinterface} function is also called recursively
for all interfaces which the passed interface itself implements.

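The following sketch puts the steps of \texttt{class\_addinterface} together. It is a simplified illustration under several assumptions: the table is indexed with positive indices instead of \texttt{interfacetable[-i]}, \texttt{calloc} replaces \texttt{MNEW}, \texttt{methodptr} is taken to be a plain byte pointer, matching is done by name and descriptor, and an interface that was already added is skipped. The names \texttt{vftbl\_sketch}, \texttt{find\_impl} and \texttt{add\_interface} are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned char *methodptr;   /* assumed: pointer to method code */

typedef struct methodinfo {
    const char *name;
    const char *descriptor;
    int         vftblindex;
} methodinfo;

typedef struct classinfo classinfo;
struct classinfo {
    classinfo  *super;
    int         index;              /* interface index, used for interfaces */
    int         methodscount;
    methodinfo *methods;
    int         interfacescount;
    classinfo **interfaces;
};

/* Simplified table: positive indices instead of interfacetable[-i]. */
typedef struct vftbl_sketch {
    methodptr  *table;
    int        *interfacevftbllength;
    methodptr **interfacetable;
} vftbl_sketch;

/* Search c and its superclasses for a method matching im. */
static const methodinfo *find_impl(const classinfo *c, const methodinfo *im)
{
    for (; c != NULL; c = c->super)
        for (int k = 0; k < c->methodscount; k++)
            if (strcmp(c->methods[k].name, im->name) == 0 &&
                strcmp(c->methods[k].descriptor, im->descriptor) == 0)
                return &c->methods[k];
    return NULL;
}

void add_interface(vftbl_sketch *v, const classinfo *c, const classinfo *ic)
{
    int i = ic->index;

    if (v->interfacevftbllength[i] == 0) {
        /* an empty interface still gets one fake NULL entry */
        int n = ic->methodscount > 0 ? ic->methodscount : 1;

        v->interfacevftbllength[i] = n;
        v->interfacetable[i] = calloc((size_t) n, sizeof(methodptr));
        assert(v->interfacetable[i] != NULL);

        for (int j = 0; j < ic->methodscount; j++) {
            const methodinfo *mi = find_impl(c, &ic->methods[j]);

            if (mi != NULL)
                v->interfacetable[i][j] = v->table[mi->vftblindex];
        }
    }

    /* recurse into the interfaces which ic itself extends */
    for (int k = 0; k < ic->interfacescount; k++)
        add_interface(v, c, ic->interfaces[k]);
}
```

Because every interface gets a non-zero length entry, even when it declares no methods, a subtype check can test \texttt{interfacevftbllength} at the interface's index to decide whether a class implements that interface.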
After the interfaces have been added, and if the currently linked
class or interface is not \texttt{java.lang.Object}, the CACAO linker
tries to find a function whose name and descriptor match
\texttt{finalize()V}. If an appropriate function is found and the
function is non-\texttt{static}, it is assigned to the
\texttt{finalizer} field of the \texttt{classinfo} structure.

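The finalizer lookup is a plain search over the methods of the class. A minimal sketch, with a reduced \texttt{methodinfo} and the hypothetical function name \texttt{find\_finalizer}:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ACC_STATIC 0x0008   /* method access flag (JVM specification) */

typedef struct methodinfo {
    const char *name;
    const char *descriptor;
    int         flags;
} methodinfo;

/* Return the non-static method finalize()V, or NULL if there is none. */
const methodinfo *find_finalizer(const methodinfo *methods, int count)
{
    int i;

    assert(methods != NULL || count == 0);

    for (i = 0; i < count; i++) {
        const methodinfo *m = &methods[i];

        if (strcmp(m->name, "finalize") == 0 &&
            strcmp(m->descriptor, "()V") == 0 &&
            !(m->flags & ACC_STATIC))
            return m;
    }

    return NULL;
}
```

A class without its own finalizer would keep the temporary value taken from its superclass earlier in the linking process.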
\section{Initialization}