+The x87 \textit{floating-point unit} (FPU) implementation is
+completely compatible with the IA32 implementation, unchanged since the
+i386 with its i387 coprocessor, and comes with all of its advantages
+and drawbacks, such as the 8-slot FPU stack.
+
+The SSE/SSE2 technique comes from more recent generations of Intel
+processors (SSE was introduced with the Pentium III, SSE2 with the
+Pentium 4) and can process scalar 32-bit \texttt{float} values and
+scalar 64-bit \texttt{double} values in the 128-bit wide \texttt{xmm}
+floating-point registers. While SSE instructions operate on 32-bit
+\texttt{float} values, SSE2 is
+responsible for 64-bit \texttt{double} values. In CACAO we implemented
+the JAVA floating-point instructions using SSE/SSE2, because SSE/SSE2
+is much easier to use and should be the technique of the future. In
+some areas SSE/SSE2 is slower than the old x87 implementation, even on
+the newly designed AMD64 architecture, but SSE/SSE2 offers 16
+floating-point registers, which should speed up everyday JAVA
+floating-point calculations. Another big advantage of SSE/SSE2 over
+x87 is the absence of the \textit{single-double precision-rounding}
+problem, which is described in detail in the ``IA32 code generator''
+section. With SSE/SSE2 the 32-bit \texttt{float} and 64-bit
+\texttt{double} arithmetic is calculated and rounded in full
+compliance with IEEE 754, so no further adjustments are needed to
+fulfill JAVA's floating-point requirements.
+
+When converting floating-point values to integer values, a JVM has to
+check for the corner cases described in the JVM specification. This is
+done with a simple inline integer compare of the integer result value;
+if the compare indicates a possible corner case, the generated code
+calls a special assembler wrapper function for builtin calls, such as
+\texttt{asm\_builtin\_f2i} for \texttt{ICMD\_F2I} (\texttt{float} to
+\texttt{int} conversion). The corner cases are then computed in a
+builtin C function that handles all special cases, such as
+\textit{Infinity} or \textit{NaN} values.
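+
+The conversion scheme can be illustrated with a minimal C sketch. It
+assumes that the inline compare tests for the value
+\texttt{0x80000000}, which the SSE conversion instruction
+\texttt{cvttss2si} produces for \textit{NaN} and out-of-range inputs.
+The names \texttt{hw\_f2i}, \texttt{f2i\_slow\_path} and \texttt{f2i}
+are hypothetical and are not taken from the CACAO sources; the
+hardware instruction is emulated in portable C.
+

```c
#include <math.h>
#include <stdint.h>

/* Stand-in for the hardware conversion (cvttss2si on AMD64). The
 * instruction yields 0x80000000 for NaN and for values outside the
 * int range. It is emulated here because an out-of-range cast is
 * undefined behaviour in portable C. */
static int32_t hw_f2i(float f)
{
    if (isnan(f) || f >= 2147483648.0f || f < -2147483648.0f)
        return INT32_MIN;
    return (int32_t) f;              /* in range: truncate toward zero */
}

/* Hypothetical slow path, standing in for the builtin C function.
 * It applies the conversion rules of the JVM specification. */
static int32_t f2i_slow_path(float f)
{
    if (isnan(f))
        return 0;                    /* NaN converts to 0 */
    if (f >= 2147483648.0f)
        return INT32_MAX;            /* too large, includes +Infinity */
    if (f <= -2147483648.0f)
        return INT32_MIN;            /* too small, includes -Infinity */
    return (int32_t) f;
}

/* Fast path plus the inline integer compare described in the text. */
int32_t f2i(float f)
{
    int32_t result = hw_f2i(f);
    if (result == INT32_MIN)         /* possible corner case */
        return f2i_slow_path(f);
    return result;
}
```
+
+Only inputs that produce the marker value reach the slow path, so
+ordinary conversions cost one extra compare and an untaken branch.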
+
+
+\subsection{Exception handling}
+
+Since the AMD64 architecture is just an extension to the IA32
+architecture, an AMD64 processor itself raises the same signals as an
+IA32 processor, so we can catch the same signals in our own signal
+handlers. This includes the signals \texttt{SIGSEGV} and
+\texttt{SIGFPE}.
+
+When a signal of this type is raised and reaches our signal handler,
+we reinstall the handler, create a new exception object and jump to
+exception handling code written in assembler. The difference from the
+exception handling code of RISC machines is that RISC machines have a
+\textit{procedure vector} (PV) register, which makes it easy to find
+the method's data segment: it starts at the PV and grows down to
+smaller addresses like a stack. For the IA32 and AMD64 architectures
+we had to implement a \textit{method tree}, which contains the start
+\textit{program counter} (PC) and the end PC of every single JAVA
+method compiled in CACAO, so that we can find the corresponding method,
+and thus the PV, for any exception PC. We need the data segment for
+the method's exception table (for a detailed description see section
+``Exception handling'').
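+
+The lookup from an exception PC to its method can be sketched as
+follows. This is only an illustration under stated assumptions: it
+uses a sorted array with binary search instead of a tree, treats the
+end PC as exclusive, and the names \texttt{method\_range} and
+\texttt{find\_method} are hypothetical.
+

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per compiled method (hypothetical layout). */
typedef struct {
    uintptr_t start_pc;  /* first address of the method's code */
    uintptr_t end_pc;    /* one past the last code address (assumed) */
    uintptr_t pv;        /* procedure vector of the method */
} method_range;

/* Binary search over entries sorted by start_pc whose ranges do not
 * overlap. Returns NULL if no compiled method contains pc. */
const method_range *find_method(const method_range *table, size_t count,
                                uintptr_t pc)
{
    size_t lo = 0;
    size_t hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (pc < table[mid].start_pc)
            hi = mid;                /* pc lies before this method */
        else if (pc >= table[mid].end_pc)
            lo = mid + 1;            /* pc lies after this method */
        else
            return &table[mid];      /* start_pc <= pc < end_pc */
    }
    return NULL;
}
```
+
+Once the entry is found, its \texttt{pv} field gives access to the
+data segment and therefore to the exception table of the method.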
+
+We use \texttt{SIGSEGV} for \textit{hardware null-pointer checking},
+so we can handle this common exception as fast as possible in
+CACAO. The signal handler creates a
+\texttt{java.lang.NullPointerException}.
+
+\texttt{SIGFPE} is used to catch integer division by zero exceptions
+in hardware. The signal handler generates a
+\texttt{java.lang.ArithmeticException} with \texttt{/ by zero} as detail
+message.
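+
+The mapping from signals to exceptions can be sketched with POSIX
+\texttt{sigaction}. This is a simplified illustration, not CACAO's
+handler: all function names are hypothetical, and the handler only
+terminates the process where a real VM would create the exception
+object and continue in its assembler exception handling code. A
+handler installed with \texttt{sigaction} stays installed, so this
+sketch has no reinstall step.
+

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Map a hardware signal to the Java exception class it stands for.
 * Returns NULL for signals that are not used for this purpose. */
const char *exception_class_for_signal(int sig)
{
    switch (sig) {
    case SIGSEGV: return "java.lang.NullPointerException";
    case SIGFPE:  return "java.lang.ArithmeticException";
    default:      return NULL;
    }
}

/* Placeholder handler. A real VM would read the faulting PC from
 * `context`, create the exception object and jump to its exception
 * handling code. This sketch cannot resume, so it only exits. */
static void hw_exception_handler(int sig, siginfo_t *info, void *context)
{
    (void) info;
    (void) context;
    _exit(exception_class_for_signal(sig) != NULL ? 70 : 71);
}

/* Install the handler for both signals. Returns 0 on success and
 * -1 if sigaction fails. */
int install_hw_exception_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = hw_exception_handler;
    sa.sa_flags = SA_SIGINFO;        /* handler receives siginfo_t */
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGFPE, &sa, NULL) != 0)
        return -1;
    return 0;
}
```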
+
+Both exceptions are handled in hardware by default, but they can also
+be checked in software by using CACAO's command-line switch
+\texttt{-softnull}. On the RISC ports only the \textit{null-pointer
+exception} is checked in software when this switch is used, but on
+IA32 and AMD64 both conditions are checked in software, so neither
+\texttt{SIGSEGV} nor \texttt{SIGFPE} is relied upon.
+
+
+\subsection{Related work}
+
+The AMD64 architecture is a fairly young architecture, released in
+April 2003. At the time of writing, the only available 64-bit
+operating systems for AMD64 are GNU/Linux (from different
+distributors), FreeBSD, NetBSD and OpenBSD. Microsoft Windows is not
+available yet, although its release was announced for the first half
+of 2004.
+
+The first available 64-bit JVM for the AMD64 architecture was GCC's
+GCJ, the GNU Compiler for the Java Programming
+Language~\cite{GCJ}. \texttt{gcj} itself is a portable, optimizing,
+ahead-of-time compiler for the JAVA Programming Language, which can
+compile:
+
+\begin{itemize}
+\item JAVA source code directly to native machine code
+\item JAVA source code to JAVA bytecode (class files)
+\item JAVA bytecode to native machine code
+\end{itemize}
+
+One part of GCJ is \texttt{gij}, the JVM interpreter. Much of the
+effort of porting the \textit{GNU Compiler Collection} to the AMD64
+architecture was done by people working at SUSE~\cite{SUSE}.
+
+For a long time no AMD64 JIT was available, until Sun~\cite{Sun}
+released their AMD64 version of J2SE 1.4.2-rc1 for GNU/Linux, by
+Blackdown~\cite{Blackdown}, in December 2003. At that time our AMD64
+JIT had already been working for months, but we were not able to
+release CACAO, because CACAO as a whole had not yet reached the status
+of a compliant JVM. The Sun JVM uses the HotSpot Server VM by default;
+the HotSpot Client VM is not available for AMD64 at this time.
+
+The Kaffe~\cite{Wilkinson:97} project has ported its interpreter to
+the AMD64 architecture for GNU/Linux, but still has no plans to port
+its JIT.