A \textit{Java Virtual Machine} (JVM) dynamically loads, links and
initializes classes and interfaces when they are needed. Loading a
class or interface means locating the binary representation---the
class files---and creating a class or interface structure from that
binary representation. Linking takes a loaded class or interface and
transfers it into the runtime state of the \textit{Java Virtual
Machine} so that it can be executed. Initialization of a class or
interface means executing the static class or interface initializer.

The following sections describe the process of loading, linking and
initializing a class or interface in the CACAO \textit{Java Virtual
Machine} in greater detail. Furthermore, the data structures and
techniques used in CACAO and the interaction with the GNU classpath
are described.

\section{System class loader}

The class loader of a \textit{Java Virtual Machine} (JVM) is
responsible for loading all types of classes and interfaces into the
runtime system of the JVM. Every JVM has a \textit{system class
loader}, which is implemented in \texttt{java.lang.ClassLoader}; this
class interacts with the JVM itself via native function calls.

The \textit{GNU classpath} implements the system class loader in
\texttt{gnu.java.lang.SystemClassLoader}, which extends
\texttt{java.lang.ClassLoader} and interacts with the JVM. The
\textit{bootstrap class loader} is implemented in
\texttt{java.lang.ClassLoader} plus the JVM-dependent class
\texttt{java.lang.VMClassLoader}. \texttt{java.lang.VMClassLoader} is
the main class through which the bootstrap class loader of the GNU
classpath interacts with the JVM. The main function of this class is

\begin{verbatim}
        static final native Class loadClass(String name, boolean resolve)
                throws ClassNotFoundException;
\end{verbatim}

This is a native function implemented in the CACAO JVM, which is
located in \texttt{nat/VMClassLoader.c} and calls the internal loader
functions of CACAO. If the \texttt{name} argument is \texttt{NULL}, a
new \texttt{java.lang.NullPointerException} is created and the
function returns \texttt{NULL}.

If the \texttt{name} is non-NULL, a new UTF8 string of the class' name
is created in the internal \textit{symbol table} via

\begin{verbatim}
        utf *javastring_toutf(java_lang_String *string, bool isclassname);
\end{verbatim}

This function converts a \texttt{java.lang.String} string into the
internally used UTF8 string representation. \texttt{isclassname} tells
the function to convert any \texttt{.} (periods) found in the class
name into \texttt{/} (slashes), so the class loader can find the
class.

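
The period-to-slash conversion can be sketched as follows. This is a
minimal illustration on plain C strings and not CACAO's
implementation: the helper name \texttt{classname\_to\_internal} is
hypothetical, and the real \texttt{javastring\_toutf} operates on
\texttt{java.lang.String} objects and the symbol table.

\begin{verbatim}
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of CACAO: returns a newly allocated
   copy of a Java class name with every '.' replaced by '/'. The
   caller owns the result and must free() it. Returns NULL if the
   allocation fails. */
char *classname_to_internal(const char *name)
{
    size_t len, i;
    char *out;

    assert(name != NULL);
    len = strlen(name);

    out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++)
        out[i] = (name[i] == '.') ? '/' : name[i];
    out[len] = '\0';

    return out;
}
\end{verbatim}

With this sketch, \texttt{java.lang.String} becomes
\texttt{java/lang/String}, which matches the directory layout of
class files in the file system or in an archive.
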
Then a new \texttt{classinfo} structure is created via the

\begin{verbatim}
        classinfo *class_new(utf *classname);
\end{verbatim}

function call. This function creates a unique representation of this
class, identified by its name, in the JVM's internal \textit{class
hashtable}. The newly created \texttt{classinfo} structure (Figure
\ref{classinfostructure}) is initialized with correct values, like
\texttt{loaded = false;}, \texttt{linked = false;} and
\texttt{initialized = false;}. This guarantees a definite state of a
new class.

\begin{figure}
\begin{verbatim}
    struct classinfo {                /* class structure                     */
        s4          flags;            /* ACC flags                           */
        utf        *name;             /* class name                          */

        s4          cpcount;          /* number of entries in constant pool  */
        u1         *cptags;           /* constant pool tags                  */
        voidptr    *cpinfos;          /* pointer to constant pool info       */
                                      /* structures                          */

        classinfo  *super;            /* super class pointer                 */

        s4          interfacescount;  /* number of interfaces                */
        classinfo **interfaces;       /* pointer to interfaces               */

        s4          fieldscount;      /* number of fields                    */
        fieldinfo  *fields;           /* field table                         */

        s4          methodscount;     /* number of methods                   */
        methodinfo *methods;          /* method table                        */

        bool        initialized;      /* true, if class already initialized  */
        bool        initializing;     /* flag for the compiler               */
        bool        loaded;           /* true, if class already loaded       */
        bool        linked;           /* true, if class already linked       */
        s4          index;            /* hierarchy depth (classes) or index  */

        s4          instancesize;     /* size of an instance of this class   */
    #ifdef SIZE_FROM_CLASSINFO
        s4          alignedsize;      /* size of an instance, aligned to the */
                                      /* allocation size on the heap         */
    #endif

        vftbl_t    *vftbl;            /* pointer to virtual function table   */

        methodinfo *finalizer;        /* finalizer method                    */

        u2          innerclasscount;  /* number of inner classes             */
        innerclassinfo *innerclass;

        utf        *packagename;      /* full name of the package            */
        utf        *sourcefile;       /* classfile name containing this class */
        java_objectheader *classloader; /* NULL for bootstrap classloader    */
    };
\end{verbatim}
\caption{\texttt{classinfo} structure}
\label{classinfostructure}
\end{figure}

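
The idea of a unique, name-keyed class entry that starts in a
definite state can be sketched as follows. This is a simplified
illustration under assumptions, not CACAO's \texttt{class\_new}: all
names (\texttt{sketch\_class}, \texttt{sketch\_class\_new},
\texttt{TABLE\_SIZE}) are hypothetical, plain C strings replace the
\texttt{utf} type, and thread safety is ignored.

\begin{verbatim}
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 64               /* illustrative bucket count */

typedef struct sketch_class {
    char *name;                     /* class name, used as key   */
    bool  loaded, linked, initialized;
    struct sketch_class *next;      /* next entry in same bucket */
} sketch_class;

static sketch_class *class_table[TABLE_SIZE];

static unsigned hash_name(const char *s)
{
    unsigned h = 5381;

    while (*s != '\0')
        h = h * 33u + (unsigned char) *s++;
    return h % TABLE_SIZE;
}

/* Return the unique entry for name. A missing entry is created in a
   definite "nothing done yet" state. Returns NULL if out of memory. */
sketch_class *sketch_class_new(const char *name)
{
    unsigned slot;
    sketch_class *c;

    assert(name != NULL);
    slot = hash_name(name);

    for (c = class_table[slot]; c != NULL; c = c->next)
        if (strcmp(c->name, name) == 0)
            return c;               /* already known: same pointer */

    c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->name = malloc(strlen(name) + 1);
    if (c->name == NULL) {
        free(c);
        return NULL;
    }
    strcpy(c->name, name);
    c->loaded = c->linked = c->initialized = false;
    c->next = class_table[slot];
    class_table[slot] = c;
    return c;
}
\end{verbatim}

Because a lookup always precedes creation, two requests for the same
name yield the same pointer, so pointer comparison is enough to test
class identity in this sketch.
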
The next step is to actually load the requested class. Thus the main
loader function

\begin{verbatim}
        classinfo *class_load(classinfo *c);
\end{verbatim}

is called, which is a wrapper function to the real loader function

\begin{verbatim}
        classinfo *class_load_intern(classbuffer *cb);
\end{verbatim}

This wrapper function is required to ensure the following:

\begin{itemize}
 \item enter a monitor on the \texttt{classinfo} structure, so that
 only one thread can load the same class at the same time

 \item initialize the \texttt{classbuffer} structure with the actual
 class file data

 \item remove the \texttt{classinfo} structure from the internal table
 if an exception occurred during loading

 \item free any allocated memory and leave the monitor
\end{itemize}

The \texttt{class\_load\_intern} function performs the actual loading
of the binary representation of the class or interface. During loading
some verifier checks are performed which can throw an error. This
error can be a \texttt{java.lang.ClassFormatError} or a
\texttt{java.lang.NoClassDefFoundError}. Some of these
\texttt{java.lang.ClassFormatError} checks are

\begin{itemize}
 \item \textit{Truncated class file} --- unexpected end of class file

 \item \textit{Bad magic number} --- class file does not start with
 the magic bytes (\texttt{0xCAFEBABE})

 \item \textit{Unsupported major.minor version} --- the bytecode
 version of the given class file is not supported by the JVM
\end{itemize}

The actual loading of the bytes from the binary representation is done
via the \texttt{suck\_*} functions. These functions are

\begin{itemize}
 \item \texttt{suck\_u1}: load one \texttt{unsigned byte} (8 bit)

 \item \texttt{suck\_u2}: load two \texttt{unsigned byte}s (16 bit)

 \item \texttt{suck\_u4}: load four \texttt{unsigned byte}s (32 bit)

 \item \texttt{suck\_u8}: load eight \texttt{unsigned byte}s (64 bit)

 \item \texttt{suck\_float}: load four \texttt{byte}s (32 bit)
 converted into a \texttt{float} value

 \item \texttt{suck\_double}: load eight \texttt{byte}s (64 bit)
 converted into a \texttt{double} value

 \item \texttt{suck\_nbytes}: load \textit{n} bytes
\end{itemize}

Loading \texttt{signed} values is done via the
\texttt{suck\_s[1,2,4,8]} macros, which cast the loaded bytes to
\texttt{signed} values. All these functions take a
\texttt{classbuffer} (Figure \ref{classbufferstructure}) structure
pointer as argument.

\begin{figure}
\begin{verbatim}
    typedef struct classbuffer {
        classinfo *class;           /* pointer to classinfo structure */
        u1        *data;            /* pointer to byte code           */
        s4         size;            /* size of the byte code          */
        u1        *pos;             /* current read position          */
    } classbuffer;
\end{verbatim}
\caption{\texttt{classbuffer} structure}
\label{classbufferstructure}
\end{figure}

This \texttt{classbuffer} structure is filled with data via the

\begin{verbatim}
        classbuffer *suck_start(classinfo *c);
\end{verbatim}

function. This function tries to locate the class, specified with the
\texttt{classinfo} structure, in the \texttt{CLASSPATH}. This can be
a plain class file in the filesystem or a file in a
\texttt{zip}/\texttt{jar} file. If the class file is found, the
\texttt{classbuffer} is filled with data collected from the class
file, including the class file size and the binary representation of
the class.

Before reading any byte of the binary representation with a
\texttt{suck\_*} function, the remaining bytes in the
\texttt{classbuffer} data array must be checked with the

\begin{verbatim}
        static inline bool check_classbuffer_size(classbuffer *cb, s4 len);
\end{verbatim}

function. If the number of remaining bytes is less than the number of
bytes to be read, specified by the \texttt{len} argument, a
\texttt{java.lang.ClassFormatError} with the detail message
\textit{Truncated class file}---as mentioned before---is thrown.

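
The combination of a size check and big-endian readers can be
sketched as follows. This is a minimal illustration, not CACAO's
code: the names \texttt{sketch\_buffer}, \texttt{sketch\_check\_size},
\texttt{sketch\_u1}, \texttt{sketch\_u2}, \texttt{sketch\_u4} and
\texttt{sketch\_has\_magic} are hypothetical, and a failed check is
reported by a \texttt{false} return value instead of a Java
exception. The byte order follows the class file format, which stores
multi-byte values with the most significant byte first.

\begin{verbatim}
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;    /* start of the class file bytes */
    size_t         size;    /* total number of bytes         */
    const uint8_t *pos;     /* current read position         */
} sketch_buffer;

/* True if at least len bytes remain to be read. */
static bool sketch_check_size(const sketch_buffer *cb, size_t len)
{
    return (size_t) (cb->data + cb->size - cb->pos) >= len;
}

/* The readers assume the caller has checked the remaining size. */
static uint8_t sketch_u1(sketch_buffer *cb)
{
    assert(sketch_check_size(cb, 1));
    return *cb->pos++;
}

static uint16_t sketch_u2(sketch_buffer *cb)
{
    uint16_t hi = sketch_u1(cb);            /* high byte comes first */

    return (uint16_t) ((hi << 8) | sketch_u1(cb));
}

static uint32_t sketch_u4(sketch_buffer *cb)
{
    uint32_t hi = sketch_u2(cb);            /* high half comes first */

    return (hi << 16) | sketch_u2(cb);
}

/* True if the next four bytes are the class file magic number.
   Returns false for truncated data or a different magic number. */
bool sketch_has_magic(sketch_buffer *cb)
{
    if (!sketch_check_size(cb, 4))
        return false;                       /* truncated class file */
    return sketch_u4(cb) == 0xCAFEBABEu;
}
\end{verbatim}

Checking the size once per group of reads, as in
\texttt{sketch\_has\_magic}, keeps the individual readers free of
error handling.
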
The following subsections describe the individual loading steps of a
class or interface from its binary representation, in chronological
order and in greater detail.

\subsection{Constant pool loading}

The class' constant pool is loaded via

\begin{verbatim}
        static bool class_loadcpool(classbuffer *cb, classinfo *c);
\end{verbatim}

from the \texttt{constant\_pool} table in the binary representation of
the class or interface. The constant pool needs to be parsed in two
passes. In the first pass the information loaded is saved in temporary
structures, which are further processed in the second pass, when the
complete constant pool has been traversed. Only when all
constant pool entries have been loaded can any constant pool entry be
completely resolved, and this resolving can only be done in a specific
order:

\begin{enumerate}
 \item \texttt{CONSTANT\_Class}

 \item \texttt{CONSTANT\_String}

 \item \texttt{CONSTANT\_NameAndType}

 \item \texttt{CONSTANT\_Fieldref}, \texttt{CONSTANT\_Methodref} and
 \texttt{CONSTANT\_InterfaceMethodref} --- these are combined into one
 structure
\end{enumerate}

The remaining constant pool types \texttt{CONSTANT\_Integer},
\texttt{CONSTANT\_Float}, \texttt{CONSTANT\_Long},
\texttt{CONSTANT\_Double} and \texttt{CONSTANT\_Utf8} can be
completely resolved in the first pass and need no further processing.

The temporary structures, shown in Figure
\ref{constantpoolstructures}, are used to \textit{forward} the data
from the first pass into the second.

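
The two-pass scheme can be sketched as follows. This is a reduced
illustration under assumptions, not CACAO's
\texttt{class\_loadcpool}: it handles only UTF8 and class entries,
works on an already decoded array instead of a \texttt{classbuffer},
and uses zero-based indices for brevity. The names
\texttt{raw\_entry}, \texttt{forward} and \texttt{resolve\_pool} are
hypothetical; the tag values 1 and 7 are those the class file format
assigns to \texttt{CONSTANT\_Utf8} and \texttt{CONSTANT\_Class}.

\begin{verbatim}
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { TAG_UTF8 = 1, TAG_CLASS = 7 };

typedef struct {
    int         tag;
    const char *text;        /* TAG_UTF8: the string itself          */
    int         name_index;  /* TAG_CLASS: index of the UTF8 name    */
} raw_entry;

typedef struct forward {     /* remembers an unresolved class entry  */
    struct forward *next;
    int this_index;
    int name_index;
} forward;

/* Fills resolved[i] with the string for UTF8 entries and with the
   class name for class entries. Returns 0 on success and -1 on a bad
   reference or allocation failure. */
int resolve_pool(const raw_entry *raw, int count, const char **resolved)
{
    forward *pending = NULL, *f;
    int i, status = 0;

    assert(raw != NULL && resolved != NULL);

    /* Pass 1: finish simple entries, remember the others. */
    for (i = 0; i < count; i++) {
        resolved[i] = NULL;
        if (raw[i].tag == TAG_UTF8) {
            resolved[i] = raw[i].text;
        } else if (raw[i].tag == TAG_CLASS) {
            f = malloc(sizeof *f);
            if (f == NULL) {
                status = -1;
                continue;
            }
            f->this_index = i;
            f->name_index = raw[i].name_index;
            f->next = pending;
            pending = f;
        }
    }

    /* Pass 2: every UTF8 entry is known now, so references resolve. */
    while (pending != NULL) {
        f = pending;
        pending = f->next;
        if (f->name_index < 0 || f->name_index >= count
            || raw[f->name_index].tag != TAG_UTF8)
            status = -1;
        else
            resolved[f->this_index] = raw[f->name_index].text;
        free(f);
    }

    return status;
}
\end{verbatim}

A class entry may precede the UTF8 entry it refers to, which is why
the reference is only followed in the second pass.
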
\begin{figure}
\begin{verbatim}
    /* CONSTANT_Class entries */
    typedef struct forward_class {
        struct forward_class *next;
        /* ... */
    } forward_class;

    /* CONSTANT_String */
    typedef struct forward_string {
        struct forward_string *next;
        /* ... */
    } forward_string;

    /* CONSTANT_NameAndType */
    typedef struct forward_nameandtype {
        struct forward_nameandtype *next;
        /* ... */
    } forward_nameandtype;

    /* CONSTANT_Fieldref, CONSTANT_Methodref or CONSTANT_InterfaceMethodref */
    typedef struct forward_fieldmethint {
        struct forward_fieldmethint *next;
        /* ... */
        u2 nameandtype_index;
    } forward_fieldmethint;
\end{verbatim}
\caption{temporary constant pool structures}
\label{constantpoolstructures}
\end{figure}

The \texttt{classinfo} structure has two pointers to arrays which
contain the class' constant pool infos, namely \texttt{cptags} and
\texttt{cpinfos}. \texttt{cptags} contains the type of the constant
pool entry. \texttt{cpinfos} contains a pointer to the constant pool
entry itself. In the second pass the references are resolved and the
runtime structures are created. In further detail this includes for

\begin{itemize}
 \item \texttt{CONSTANT\_Class}: get the UTF8 name string of the
 class, store type \texttt{CONSTANT\_Class} in \texttt{cptags}, create
 a class in the class hashtable with the UTF8 name and store the
 pointer to the new class in \texttt{cpinfos}

 \item \texttt{CONSTANT\_String}: get the UTF8 string of the
 referenced string, store type \texttt{CONSTANT\_String} in
 \texttt{cptags} and store the UTF8 string pointer into
 \texttt{cpinfos}

 \item \texttt{CONSTANT\_NameAndType}: create a
 \texttt{constant\_nameandtype} (Figure \ref{constantnameandtype})
 structure, get the UTF8 name and descriptor string of the field or
 method and store them into the \texttt{constant\_nameandtype}
 structure, store type \texttt{CONSTANT\_NameAndType} into
 \texttt{cptags} and store a pointer to the
 \texttt{constant\_nameandtype} structure into \texttt{cpinfos}
\end{itemize}

\begin{figure}
\begin{verbatim}
    typedef struct {            /* NameAndType (Field or Method)       */
        utf *name;              /* field/method name                   */
        utf *descriptor;        /* field/method type descriptor string */
    } constant_nameandtype;
\end{verbatim}
\caption{\texttt{constant\_nameandtype} structure}
\label{constantnameandtype}
\end{figure}

\begin{itemize}
 \item \texttt{CONSTANT\_Fieldref}, \texttt{CONSTANT\_Methodref} and
 \texttt{CONSTANT\_InterfaceMethodref}: create a
 \texttt{constant\_FMIref} (Figure \ref{constantFMIref}) structure,
 get the referenced \texttt{constant\_nameandtype} structure, which
 contains the name and descriptor resolved in a previous step, and
 store the name and descriptor into the \texttt{constant\_FMIref}
 structure, get the pointer of the referenced class, which was created
 in a previous step, and store the pointer of the class into the
 \texttt{constant\_FMIref} structure, store the type of the current
 constant pool entry in \texttt{cptags} and store a pointer to
 \texttt{constant\_FMIref} in \texttt{cpinfos}
\end{itemize}

\begin{figure}
\begin{verbatim}
    typedef struct {     /* Fieldref, Methodref and InterfaceMethodref     */
        classinfo *class;  /* class containing this field/method/interface */
        utf *name;         /* field/method/interface name                  */
        utf *descriptor;   /* field/method/interface type descriptor string */
    } constant_FMIref;
\end{verbatim}
\caption{\texttt{constant\_FMIref} structure}
\label{constantFMIref}
\end{figure}

Any UTF8 strings, \texttt{constant\_nameandtype} structures or
referenced classes are resolved with the

\begin{verbatim}
        voidptr class_getconstant(classinfo *c, u4 pos, u4 ctype);
\end{verbatim}

function. This function checks for type equality and then returns the
requested \texttt{cpinfos} slot of the specified class.

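
A type-checked constant pool lookup of this kind can be sketched as
follows. This is an illustration only: \texttt{sketch\_pool} and
\texttt{sketch\_getconstant} are hypothetical names, and returning
\texttt{NULL} on a mismatch is an assumption made for the sketch, not
a statement about how \texttt{class\_getconstant} reports errors.

\begin{verbatim}
#include <assert.h>
#include <stddef.h>

typedef struct {
    int                  count;  /* number of constant pool slots */
    const unsigned char *tags;   /* type of each slot             */
    void               **infos;  /* resolved entry of each slot   */
} sketch_pool;

/* Returns the entry at pos if the slot exists and has the expected
   type, otherwise NULL. */
void *sketch_getconstant(const sketch_pool *p, int pos,
                         unsigned char ctype)
{
    assert(p != NULL);

    if (pos < 0 || pos >= p->count)
        return NULL;             /* index outside the pool        */
    if (p->tags[pos] != ctype)
        return NULL;             /* slot holds a different type   */

    return p->infos[pos];
}
\end{verbatim}

Keeping the tags in a separate array, as \texttt{cptags} does, lets
the type check run without touching the entry itself.
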
\subsection{Interface loading}

Interface loading is simple and straightforward. After reading
the number of interfaces, for every interface referenced, a
\texttt{u2} constant pool index is read from the currently loading
class or interface. This index is used to resolve the interface class
via the \texttt{class\_getconstant} function from the class' constant
pool. This means that interface \textit{loading} is more interface
\textit{resolving} than loading. The resolved interfaces are stored
in a \texttt{classinfo *} array allocated by the class loader. The
memory pointer of the array is assigned to the \texttt{interfaces}
field of the \texttt{classinfo} structure.

\subsection{Field loading}

The number of fields of the class or interface is read as \texttt{u2}
value. For each field the function

\begin{verbatim}
        static bool field_load(classbuffer *cb, classinfo *c, fieldinfo *f);
\end{verbatim}

is called. The \texttt{fieldinfo *} argument is a pointer to a
\texttt{fieldinfo} structure (Figure \ref{fieldinfostructure})
allocated by the class loader. The field's \texttt{name} and
\texttt{descriptor} are resolved from the class constant pool via
\texttt{class\_getconstant}. If the verifier option is turned on, the
field's \texttt{flags}, \texttt{name} and \texttt{descriptor} are
checked for validity; a failed check results in a
\texttt{java.lang.ClassFormatError}.

\begin{figure}
\begin{verbatim}
    struct fieldinfo {        /* field of a class                           */
        s4   flags;           /* ACC flags                                  */
        s4   type;            /* basic data type                            */
        utf *name;            /* name of field                              */
        utf *descriptor;      /* JavaVM descriptor string of field          */

        s4   offset;          /* offset from start of object (instance      */
                              /* variables)                                 */

        imm_union value;      /* storage for static values (class           */
                              /* variables)                                 */

        classinfo *class;     /* needed by typechecker. Could be optimized  */
                              /* away by using constant_FMIref instead of   */
                              /* fieldinfo throughout the compiler.         */
    };
\end{verbatim}
\caption{\texttt{fieldinfo} structure}
\label{fieldinfostructure}
\end{figure}

Each field can have some attributes. The number of attributes is read
as \texttt{u2} value from the binary representation. If the field has
the \texttt{ACC\_FINAL} bit set in the flags, the
\texttt{ConstantValue} attribute is available. This is the only
attribute processed by \texttt{field\_load} and can occur only once,
otherwise a \texttt{java.lang.ClassFormatError} is thrown. The
\texttt{ConstantValue} entry in the constant pool contains the value
for the \texttt{final} field. Depending on the field's type, the
proper constant pool entry is resolved and assigned.

\subsection{Method loading}

As for the fields, the number of the class or interface methods is read from
the binary representation as \texttt{u2} value. For each method the function

\begin{verbatim}
        static bool method_load(classbuffer *cb, classinfo *c, methodinfo *m);
\end{verbatim}

is called. The beginning of the method loading code is nearly the same
as the field loading code. The \texttt{methodinfo *} argument is a
pointer to a \texttt{methodinfo} structure allocated by the class
loader. The method's \texttt{name} and \texttt{descriptor} are
resolved from the class constant pool via
\texttt{class\_getconstant}. With the verifier turned on, some method
checks are carried out. These include \texttt{flags}, \texttt{name}
and \texttt{descriptor} checks and an argument count check.

\begin{figure}
\begin{verbatim}
    struct methodinfo {                 /* method structure                */
        java_objectheader header;       /* we need this in jit's monitorenter */
        s4          flags;              /* ACC flags                       */
        utf        *name;               /* name of method                  */
        utf        *descriptor;         /* JavaVM descriptor string of method */

        bool        isleafmethod;       /* does method call subroutines    */

        classinfo  *class;              /* class, the method belongs to    */
        s4          vftblindex;         /* index of method in virtual function */
                                        /* table (if it is a virtual method) */
        s4          maxstack;           /* maximum stack depth of method   */
        s4          maxlocals;          /* maximum number of local variables */
        s4          jcodelength;        /* length of JavaVM code           */
        u1         *jcode;              /* pointer to JavaVM code          */

        s4          exceptiontablelength; /* exceptiontable length         */
        exceptiontable *exceptiontable; /* the exceptiontable              */

        u2          thrownexceptionscount; /* number of exceptions attribute */
        classinfo **thrownexceptions;   /* checked exceptions a method may throw */

        u2          linenumbercount;    /* number of linenumber attributes */
        lineinfo   *linenumbers;        /* array of lineinfo items         */

        u1         *stubroutine;        /* stub for compiling or calling natives */
    };
\end{verbatim}
\caption{\texttt{methodinfo} structure}
\label{methodinfostructure}
\end{figure}

The method loading function has to distinguish between a
\texttt{native} and a ``normal'' Java method. Depending on the
\texttt{ACC\_NATIVE} flag, a different stub is created.

For a Java method, a \textit{compiler stub} is created. The purpose of
this stub is to call the CACAO JIT compiler with a pointer to the byte
code of the Java method as argument to compile the method into machine
code. During code generation a pointer to this compiler stub routine
is used as a temporary method call target if the method is not
compiled yet. After the target method is compiled, the new entry point
of the method is patched into the generated code; the compiler stub is
then no longer needed and is freed.

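
The principle of compiling on first call and then redirecting later
calls can be sketched as follows. This is a strongly simplified
illustration, not CACAO's mechanism: a function pointer in a structure
stands in for the call site that CACAO patches in generated machine
code, the ``compiler'' merely selects a prewritten C function, and all
names (\texttt{sketch\_method}, \texttt{compiler\_stub},
\texttt{compiled\_body}, \texttt{sketch\_method\_init}) are
hypothetical.

\begin{verbatim}
#include <assert.h>
#include <stddef.h>

typedef struct sketch_method sketch_method;
typedef int (*entry_fn)(sketch_method *m, int arg);

struct sketch_method {
    entry_fn entry;          /* current entry point: stub or code   */
    int      compile_count;  /* how often the "compiler" has run    */
};

/* Stands in for machine code produced by a JIT compiler. */
static int compiled_body(sketch_method *m, int arg)
{
    (void) m;
    return arg * 2;
}

/* Stands in for the compiler stub: "compile" on the first call,
   patch the entry point, then continue in the compiled code. */
static int compiler_stub(sketch_method *m, int arg)
{
    m->compile_count++;
    m->entry = compiled_body;
    return m->entry(m, arg);
}

/* A freshly loaded method starts out pointing at the stub. */
void sketch_method_init(sketch_method *m)
{
    assert(m != NULL);
    m->entry = compiler_stub;
    m->compile_count = 0;
}
\end{verbatim}

After the first call the stub is never reached again, so the cost of
compilation is paid once and only for methods that are actually
executed.
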
If the method is a \texttt{native} method, the loader tries to find
the native function. If the function is found, a \textit{native stub}
is generated. This stub is responsible for adapting the method's
arguments to suit the \texttt{native} function called. This
includes inserting the \textit{JNI environment} pointer as first
argument and, if the \texttt{native} method has the
\texttt{ACC\_STATIC} flag set, inserting a pointer to the method's
class as second argument. If the \texttt{native} method is
\texttt{static}, the native stub also checks if the method's class is
already initialized. If the method's class is not initialized when the
native stub is generated, \texttt{asm\_check\_clinit} calling code
is emitted.

Each method can have some attributes. The method loading function
processes two of them: \texttt{Code} and \texttt{Exceptions}.

The \texttt{Code} attribute is a \textit{variable-length} attribute
which contains the Java Virtual Machine instructions---the byte
code---of the Java method. If the method is either \texttt{native} or
\texttt{abstract}, it must not have a \texttt{Code} attribute,
otherwise it must have exactly one \texttt{Code}
attribute. In addition to the byte code, the \texttt{Code} attribute
contains the exception table and the attributes of the \texttt{Code}
attribute itself. One exception table entry contains the
\texttt{start\_pc}, \texttt{end\_pc} and
\texttt{handler\_pc} of the \texttt{try-catch} block, each read as
\texttt{u2} value, plus a reference to the class of the
\texttt{catch\_type}. Currently there are two attributes of the
\texttt{Code} attribute defined in the JVM specification:
\texttt{LineNumberTable} and \texttt{LocalVariableTable}. CACAO only
processes the \texttt{LineNumberTable} attribute. A
\texttt{LineNumberTable} entry consists of the \texttt{start\_pc} and
the \texttt{line\_number}, which are stored in a \texttt{lineinfo}
structure (Figure \ref{lineinfostructure}).

\begin{figure}
\caption{\texttt{lineinfo} structure}
\label{lineinfostructure}
\end{figure}

The line number count and the memory pointer of the \texttt{lineinfo}
structure array are assigned to the \texttt{methodinfo} fields
\texttt{linenumbercount} and \texttt{linenumbers} respectively.

The \texttt{Exceptions} attribute is a \textit{variable-length}
attribute and contains the checked exceptions the Java method may
throw. The \texttt{Exceptions} attribute consists of the count of
exceptions, which is stored in the \texttt{methodinfo} field
\texttt{thrownexceptionscount}, and the corresponding number of
\texttt{u2} constant pool index values. The exception classes are
resolved from the constant pool and stored in an allocated
\texttt{classinfo *} array, whose memory pointer is assigned to the
\texttt{methodinfo} field \texttt{thrownexceptions}.

Any attributes which are not processed by the CACAO class loading
system are skipped via

\begin{verbatim}
        static bool skipattributebody(classbuffer *cb);
\end{verbatim}

which skips one attribute or

\begin{verbatim}
        static bool skipattributes(classbuffer *cb, u4 num);
\end{verbatim}

which skips a specified number \texttt{num} of attributes. If any
problem occurs in the method loading function, a
\texttt{java.lang.ClassFormatError} with a specific detail message is
thrown.

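
Skipping an unprocessed attribute can be sketched as follows. In the
class file format every attribute starts with a \texttt{u2} name
index and a \texttt{u4} length, followed by that many bytes, so an
attribute can be skipped without understanding its content. The code
is an illustration only: the names \texttt{attr\_buffer},
\texttt{sketch\_skip\_attribute} and \texttt{sketch\_skip\_attributes}
are hypothetical, errors are reported by a \texttt{false} return value
instead of a Java exception, and how CACAO divides the work between
its two functions is not shown here.

\begin{verbatim}
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *pos;     /* current read position      */
    const uint8_t *end;     /* one past the last byte     */
} attr_buffer;

/* Skip one attribute: u2 name index, u4 length, then length bytes.
   Returns false if the data ends early. */
bool sketch_skip_attribute(attr_buffer *b)
{
    uint32_t len;

    assert(b != NULL && b->pos <= b->end);

    if ((size_t) (b->end - b->pos) < 6)
        return false;                   /* header is truncated */

    /* The length is stored big-endian after the 2-byte name index. */
    len = ((uint32_t) b->pos[2] << 24) | ((uint32_t) b->pos[3] << 16)
        | ((uint32_t) b->pos[4] << 8)  |  (uint32_t) b->pos[5];
    b->pos += 6;

    if ((size_t) (b->end - b->pos) < len)
        return false;                   /* body is truncated   */

    b->pos += len;
    return true;
}

/* Skip num consecutive attributes. */
bool sketch_skip_attributes(attr_buffer *b, uint32_t num)
{
    uint32_t i;

    for (i = 0; i < num; i++)
        if (!sketch_skip_attribute(b))
            return false;
    return true;
}
\end{verbatim}

Checking the declared length against the remaining bytes before
advancing keeps a malformed attribute from moving the read position
past the end of the buffer.
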
\subsection{Attribute loading}

Attribute loading is done via the

\begin{verbatim}
        static bool attribute_load(classbuffer *cb, classinfo *c, u4 num);
\end{verbatim}

function. The currently loading class or interface can contain some
additional attributes which have not already been loaded. The CACAO
system class loader processes two of them: \texttt{InnerClasses} and
\texttt{SourceFile}.

The \texttt{InnerClasses} attribute is a \textit{variable-length}
attribute in the \texttt{attributes} table of the binary
representation of the class or interface. An \texttt{InnerClasses}
entry contains the \texttt{inner\_class} constant pool index itself,
the \texttt{outer\_class} index, the \texttt{name} index of the inner
class' name and the inner class' \texttt{flags} bitmask. All these
values are read in \texttt{u2} chunks.

The constant pool indexes are used with the

\begin{verbatim}
        voidptr innerclass_getconstant(classinfo *c, u4 pos, u4 ctype);
\end{verbatim}

function call to resolve the classes or UTF8 strings. After resolving
is done, all values are stored in the \texttt{innerclassinfo}
structure (Figure \ref{innerclassinfostructure}).

\begin{figure}
\begin{verbatim}
    struct innerclassinfo {
        classinfo *inner_class;       /* inner class pointer */
        classinfo *outer_class;       /* outer class pointer */
        utf       *name;              /* innerclass name     */
        s4         flags;             /* ACC flags           */
    };
\end{verbatim}
\caption{\texttt{innerclassinfo} structure}
\label{innerclassinfostructure}
\end{figure}

The other attribute, \texttt{SourceFile}, is just one \texttt{u2}
constant pool index value to get the UTF8 string reference of the
class' \texttt{SourceFile} name. The string pointer is assigned to the
\texttt{sourcefile} field of the \texttt{classinfo} structure.

Both attributes must occur only once. Attributes other than these two
are skipped with the earlier mentioned \texttt{skipattributebody}
function.

After the attribute loading is done and no error occurred, the
\texttt{class\_load\_intern} function returns the \texttt{classinfo}
pointer to signal that there was no problem. If \texttt{NULL} is
returned, there was an exception.

\section{Dynamic class loader}

\section{Eager - lazy class loading}

\section{Initialization}